(57) Abstract

A method for obtaining longer cDNA sequences is provided. The method utilizes a known genomic DNA sequence or a partial
cDNA sequence, such as can be obtained from GenBank partial cDNAs. Two PCR primers are designed to correspond to the ends of the
known partial sequence and to anneal to DNA in a cDNA library so as to initiate extension away from the known cDNA and the other
primer. The primers are added to a cDNA library with appropriate enzymes and extend through additional DNA sequence to produce PCR
products, which are subsequently purified and sequenced to provide new sequences. The new sequences are then compared with the known
partial cDNA sequence for areas of overlap, and the sequence is extended beyond the overlapping areas to provide longer DNA sequence.
IMPROVED METHOD FOR OBTAINING FULL-LENGTH cDNA SEQUENCES

TECHNICAL FIELD 

The present invention is in the field of molecular biology
and, more particularly, in the field of recombinant DNA technology.

BACKGROUND ART 

PCR has become a widely used nucleic acid amplification
technique since it was first presented by Kary Mullis at the Cold
Spring Harbor Symposium (Mullis K et al (1986) Cold Spring Harbor
Symp Quant Biol 51: 263-273). PCR requires that a pair of primers
be generated from known sequences. However, in many cases,
sequence is available only from one end of a DNA segment. Several
methods have been developed to sequence an entire gene once a
partial nucleotide sequence is available. As more partial cDNA
sequences become available in the world's genetic databanks, more
efficient and economical methods will be sought for obtaining
the complete gene.

PCR has become a widely used technique to complete genes for
which a partial sequence is already known. Gene-specific primers
and primers located in the vector into which the cDNAs have been
cloned are used for this purpose. However, this method is limited
by the use of primers complementary to vector sequence, which is
common to all clones in the library. This results in an abundance
of non-specific PCR products which have to be cloned and
sequenced. Multiple rounds of amplification with nested primers
might be required. These additional operations increase the
incorporation of errors.

Gobinda, Turner and Bolander (1993) in PCR Methods and
Applications 2:318-22 disclose "restriction-site PCR" as a direct
method of retrieving unknown sequence which is adjacent to a known
locus by using universal primers. First, genomic DNA is amplified
in the presence of restriction site oligonucleotides and a primer
specific to the known region. Next, those products are subjected
to a second round of PCR with the same restriction site
oligonucleotides and another specific primer internal to the first
one. Subsequently, the products of the last round of PCR are
transcribed with an appropriate RNA polymerase and sequenced with
a reverse transcriptase and an end-labeled specific primer
internal to the second specific PCR primer. Gobinda et al.
present data concerning Factor IX for which they identified a
conserved stretch of 20 nucleotides in the 3' noncoding region of
the gene.

Inverse PCR was the first method reported to acquire unknown
sequences successfully, starting with primers based on a
known region (Triglia T, Peterson MG, and Kemp DJ (1988) Nucleic
Acids Res. 16:8186). Inverse PCR employs a strategy in which
several restriction enzymes are used to generate a suitable
fragment in the known region. The segment is then circularized by
intramolecular ligation and used as a PCR template with divergent
primers created from the known region. However, the requirement
of multiple restriction enzyme digestions followed by multiple
ligations (even before PCR is started) makes the procedure slow and
expensive (Gobinda et al., supra).

Capture PCR, first disclosed by Lagerstrom M, Parik J,
Malmgren H, Stewart J, Patterson U and Landegren U (1991) PCR
Methods Applic. 1:111-19, is a method for PCR amplification of DNA
fragments adjacent to a known sequence in human and YAC DNA. As
noted by Gobinda et al., supra, that method also requires multiple
restriction enzyme digestions and ligation of an engineered
double-stranded primer before PCR. Although the restriction and
ligation reactions are carried out simultaneously in this method,
the requirement of an extension reaction, immobilization of the
extended product, two rounds of PCR and purification of template
prior to sequencing renders it cumbersome and time consuming as
well.
Walking PCR, disclosed by Parker JD, Rabinovitch PS, and
Burmer GC (1991) Nucleic Acids Res 19:3055-60, teaches a method
for targeted gene walking via PCR. Although this method also
permits retrieval of unknown sequence, Gobinda et al., supra, note
that it requires an oligomer-extension assay followed by
identification and gel purification of the desired band prior to
sequencing. Such extra steps again limit the applicability of the
method.

The enzymes originally used in PCR were limited in their
ability to reliably amplify pieces of nucleic acid longer than 3 kb.
One of the explanations for this limitation seems to be the
misincorporation of nucleotides, resulting in non-basepairing
mismatches which these enzymes often fail to extend.

Only the mixture of two enzymes, rTth DNA polymerase and
Vent, the latter of which has so-called "proofreading" activity,
together with the optimization of amplification conditions, finally
overcame this limitation and made amplification of pieces of DNA of
up to 40 kb possible.

The most common way to identify genes expressed in a certain
tissue at a certain time is the isolation of the mRNA of that
particular tissue and the conversion of this mRNA into so-called
cDNA (complementary DNA). These cDNAs are subsequently cloned into
a vector (plasmid or Lambda) and amplified by transfection into
E. coli cells, resulting in a so-called cDNA library.

First and most important to researchers attempting to obtain
a complete gene is that the enzymes used in converting mRNA into
cDNA are limited in their ability to produce complete copies of
the existing mRNAs. This requires the researcher to isolate
multiple cDNA clones of the gene of interest using specific probes
and analyze each of these isolates for a complete cDNA of the gene
of interest. This process is called screening of cDNA libraries.

A major problem facing molecular biologists is finding the
most efficient method to use to obtain a full-length cDNA from a
partial sequence. Such sequences are appearing with increasing
frequency in GenBank, from commercial cDNA libraries and privately
prepared libraries. The inventive method disclosed herein is a
contribution to that art.
DISCLOSURE OF THE INVENTION

An improved method for extending the DNA sequence of a known
fragment of DNA sequence is provided. The method may be used for
extending known DNA sequences of genomic or cDNA origin. The
method utilizes the polymerase chain reaction (PCR) and includes
the steps of:

a) combining a first and second PCR primer with nucleic acid
from a cDNA library, or pools of cDNA libraries, expected to
contain said partial cDNA, or said partial cDNA that has been
extended, or a genomic library, under conditions suitable for
synthesis of nucleic acid PCR products from the first and second
primers, wherein said first and second primers are capable of
annealing to opposite strands of the partial cDNA or genomic DNA
and initiating nucleic acid synthesis in an outward manner and
wherein the first primer is capable of being extended by DNA
polymerase in an antisense direction and the second primer is
capable of being extended in a sense direction,

b) purifying the PCR products, and

c) identifying extended nucleotide sequences derived from
said partial cDNA or said genomic DNA. In one embodiment of the
present invention, the method of identifying the extended
nucleotide sequences comprises nucleic acid sequencing. In
another embodiment of the present invention, the method proceeds
with repeating steps a) through c) on the nucleotide sequences
identified in step c).

In another embodiment of the present invention, there is a
method for extending the nucleotide sequence of a partial
complementary DNA (cDNA) using polymerase chain reaction (PCR),
comprising the steps of

a) combining a first and second PCR primer
with nucleic acid from a cDNA library, or pools of cDNA libraries,
expected to contain said partial cDNA, or said partial cDNA that
has been extended, or a genomic DNA library, under conditions
suitable for synthesis of nucleic acid PCR products from the first
and second primers, wherein said first and second primers are
capable of annealing to opposite strands of the partial cDNA and
initiating nucleic acid synthesis in an outward manner and wherein
the first primer is capable of being extended by DNA polymerase in
an antisense direction and the second primer is capable of being
extended in a sense direction,

b) purifying the PCR products,

c) ligating the purified PCR products under conditions
suitable for the formation of circular, closed nucleic acid,

d) transforming a host cell with the circular, closed nucleic
acid and culturing the transformed host cell under conditions
suitable for growth,

e) recovering said circular closed nucleic acid from the
cultured, transformed host cell, and

f) identifying extended nucleotide sequences derived from
said partial cDNA or said genomic DNA.

The present invention also provides a method for extending
known genomic DNA sequences which may be used for the detection
and amplification of 5' untranslated nucleotide sequences and/or
promoter sequences.

Also provided is an isolated DNA molecule comprising SEQ ID
NO: 11, the DNA for a novel human purinergic P2U receptor.

Also provided is an isolated DNA molecule comprising SEQ ID
NO: 12, the DNA for a novel human C5a-like seven transmembrane
receptor.

These and other objects, advantages and features of the
present invention will become apparent to those persons skilled in
the art upon reading the details of the structure, synthesis,
formulation and usage as more fully set forth below, reference
being made to the accompanying figures forming a part hereof.

BRIEF DESCRIPTION OF DRAWINGS 

Figure 1 is a flow chart of the steps in the inventive
method.

Figure 2 shows a typical plasmid obtained from the excision
process of a lambdaZAP cDNA library. Typically 250-300 base pairs
of the sequence are obtained in the high-throughput sequence
operation. The clone is partially sequenced from the 5' end with
T3 as a sequencing primer.

Figure 3 is a representation of the next step, in which
pBLUESCRIPT SK plasmids in a cDNA library are used as a template
and the two specially designed primers (XLR and XLS) amplify
plasmids containing the gene of interest. Only plasmids
containing priming sites for both XL-PCR primers and the gene of
interest will be amplified during the XL-PCR reaction.

Figure 4 is a representation of the amplified DNA segments
which have been obtained through the XL-PCR reaction and
subsequently purified after separating the products on an agarose
gel. For best results, the cDNA library used as a template should
be synthesized by random priming to assure the availability in
this step of amplified DNA of different lengths (3' end) between
the XLS priming site and the T7 priming site in the vector. The
length of the 5' end (between the XLR priming site and the T3
priming site in the vector) will vary in size depending on how
much of the mRNA of the gene of interest had been converted into
cDNA during the cDNA library synthesis.

Figure 5 shows how the purified DNA segments containing the
plasmid and the gene of interest are religated to form a circular
plasmid and transformed into bacteria for amplification. Here
chemically competent E. coli cells were transformed and grown on
petri dishes containing LB agar and 25 mg/L carbenicillin (2XCarb)
for antibiotic selection.

Figure 6 shows schematically how pure samples of clones were
obtained from the different E. coli colonies grown in the
procedure shown in Figure 5 (also Step 1 purification, Step 2
religation and Step 3 transformation in Figure 6). These clones
are screened in Step 4 for additional sequence of the gene of
interest at the 5' end. For this purpose the clones were analyzed
by a PCR reaction employing the XLR primer and the T3 vector
primer. The size of the resulting product will indicate how much
additional sequence upstream of the XLR priming site each clone
contains.

Figures 7A through 7H show the results of the inventive
method, in which a partial sequence from Incyte clone 14770, which
was similar to heat shock protein 90, was successively sequenced
to obtain a full-length cDNA.

Figures 8A through 8F show the results of the inventive
method, in which a partial sequence from Incyte clone 87058, which
was similar to cathepsin, was successively sequenced to obtain
extensions of the cDNA.

MODES FOR CARRYING OUT THE INVENTION

Unless defined otherwise, all technical and scientific terms
used herein have the same meaning as is commonly understood by one
of skill in the art to which this invention belongs. All patents
and publications referred to herein are incorporated by reference
herein.

Before the present compounds, variants, formulations and
methods for making and using such are described, it is to be
understood that this invention is not limited to the particular
compounds, variants, formulations or methods described, as such
enzymes, formulations and methodologies may, of course, vary. The
terminology used herein is for the purpose of describing
particular embodiments only and is not intended to be limiting
since the scope of protection will be limited only by the appended
claims.

In the specification and appended claims, the singular forms
"a", "an" and "the" include plural referents unless the context
clearly dictates otherwise. Thus, for example, reference to "a
high-fidelity PCR enzyme" includes mixtures of such enzymes and
any other enzymes fitting the stated criteria, and reference to the
method includes reference to one or more methods for obtaining
full-length cDNA sequences which will be known to those skilled in
the art or will become known to them upon reading this
specification.

The present method provides a way to utilize a genomic
DNA library or a plasmid cDNA library (either obtained by cloning
cDNAs directly into a plasmid vector or by converting a Lambda
library into a plasmid library by known methods, e.g. Lambda ZAP
excision or Lambda ZIPLOCK conversion) which has been used for
sequencing cDNAs, as a source to obtain much longer DNAs and in
certain cases complete genes of partially known DNA sequences.
The steps disclosed herein are based on cDNA libraries but equally
apply to genomic DNA libraries.

This new method utilizes PCR kits which enable the researcher
to amplify long pieces of DNA. The XL-PCR amplification kit
(Perkin-Elmer) was employed. However, equivalent products may be
available from other major suppliers. This novel method allows one
person to process multiple genes (up to 96 genes) at a time and
obtain extended or complete sequence (possibly full-length) of the
cDNAs of interest within 6-10 days. This compares very favorably
with current competitive methods like screening with labelled
probes, which allow one worker to process only about 3-5 genes and
obtain initial results in 14-40 days. This represents an increase
in throughput of at least 1000%.
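
The throughput figures quoted above can be checked with simple arithmetic. The following is a minimal sketch that uses only the numbers given in the text (96 genes in 6-10 days versus 3-5 genes in 14-40 days). The function name and the decision to compare the slowest case of the present method against the fastest case of probe screening are illustrative assumptions, chosen to give the most conservative estimate.

```python
def genes_per_day(genes: float, days: float) -> float:
    """Average number of genes one worker processes per day."""
    return genes / days


# Most conservative comparison (an assumption of this sketch):
# present method at its slowest (96 genes in 10 days) versus
# probe screening at its fastest (5 genes in 14 days).
new_method_worst = genes_per_day(96, 10)
screening_best = genes_per_day(5, 14)

fold = new_method_worst / screening_best
percent_increase = (fold - 1) * 100

print(f"fold improvement: {fold:.1f}x")
print(f"percent increase: {percent_increase:.0f}%")
```

Even under this conservative pairing the increase is well above the 1000% stated in the text.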

This increased efficiency is possible because of the
inventive combination of steps shown in the flow chart (Figure 1).
First, primer design and synthesis (based on a known partial
sequence) can be performed in about two days. The PCR
amplification can be performed in 6-8 hours. Multiple libraries
can be pooled and therefore screened at the same time. The next
steps of purification and ligation take about one day. Then
transformation and growing up the bacteria take one day. Then
screening for clones with additional sequence of the genes of
interest by PCR takes approximately five hours. The next steps of
DNA preparation and sequencing of the selected clones can be
performed in about one day. This totals 6-7 days. At the end of
this time, one has usually obtained a much longer cDNA sequence,
assuming such a longer cDNA existed in the libraries than what was
initially sequenced. If the new sequence is a complete gene, then
the goal has been reached. If the complete sequence has not been
obtained, one still has a much longer sequence than before, and
this longer sequence can be used to design primers to repeat the
procedure on the same or another library. The choice of library
is up to the researcher, but a preferred library is one that has
been size-selected to include only larger cDNAs.

This method presumes that one already has partial cDNA
sequences, either from a publicly available database or the
scientist's own earlier research, including but not limited to
earlier preparation of a cDNA library whose cDNAs have been
partially sequenced. The cDNA library may have been prepared with
oligo dT or random primers. The difference between oligo dT and
randomly primed libraries is that a randomly primed library will
have more sequences which contain 5' ends of cDNAs. A randomly
primed library may be particularly useful for further work when
the oligo dT library does not yield a complete gene. Random
priming of the library also helps yield more cDNA sequences of
different lengths. Library preparation techniques which promote
longer insert sizes will in turn permit the sequencing of more
complete cDNAs. Obviously, the larger the protein, the less
likely it is that the complete cDNA will be found in a single
plasmid.

Figure 2 shows a typical plasmid containing a cDNA which had
been partially sequenced from the 5' end with T3 as a primer. The
top darkened portion represents the insert containing the gene of
interest.

Step 1: PCR amplification of cDNA clones containing the gene of
interest

The first step of this method requires the design of two
primers based on the known sequence. The known sequence can be
obtained by those skilled in the art either by a wet lab method or
from the many publicly available DNA databases. One primer is
synthesized to be extended in an antisense direction (XLR) and the
other in the sense direction (XLS or XLF). In effect, the primers
are designed to anneal to either end of the known sequence and to
be extended "outward" from there to generate amplicons containing
new, unknown sequences of the genes of interest. This is
different from typical PCR, in which the primers are designed to
amplify a known sequence in a direction "inward" toward each
other.
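
The outward orientation of the two primers can be illustrated in code. The following is a minimal sketch, assuming the known sequence is supplied as its sense strand written 5' to 3'. The function names, the default primer length of 22 and the placement of the primers at the very ends of the known sequence are assumptions made for illustration; in practice the primer positions would be chosen with primer design software.

```python
from typing import Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def outward_primers(known_sense: str, length: int = 22) -> Tuple[str, str]:
    """Return (XLR, XLS), both written 5'->3'.

    XLR is the reverse complement of the 5' end of the known sense strand,
    so its 3' end points upstream and it is extended in the antisense
    direction. XLS is a copy of the 3' end of the known sense strand, so
    its 3' end points downstream and it is extended in the sense direction.
    """
    known_sense = known_sense.upper()
    if len(known_sense) < 2 * length:
        raise ValueError("known sequence too short for two separate primers")
    xlr = reverse_complement(known_sense[:length])
    xls = known_sense[-length:]
    return xlr, xls


# Hypothetical 60-nt known sequence, used only to exercise the function.
known = "ACGT" * 15
xlr, xls = outward_primers(known)
print("XLR:", xlr)
print("XLS:", xls)
```

In a typical "inward" PCR the roles would be reversed: the sense primer would sit at the 5' end of the known region and the antisense primer at its 3' end.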

The primers need to be designed to meet optimal
criteria for extra long PCR. A program like Oligo 4.0 (National
Biosciences, Inc., Plymouth MN) can be employed for this purpose.
In general, primers should be 22-30 nucleotides in length, have
a GC content of 50% or more and anneal at 68°C-72°C to the
target. Hairpin structures and primer-primer dimerizations must be
avoided.
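
The length and GC content criteria can be screened mechanically before a candidate primer is passed to dedicated design software. The following is a minimal sketch with hypothetical function names. It checks only the 22-30 nucleotide length range and the 50% GC threshold stated above; it does not estimate the annealing temperature and does not detect hairpins or primer-primer dimers, which still have to be evaluated separately.

```python
def _clean(primer: str) -> str:
    """Upper-case the primer and drop the spaces used for triplet grouping."""
    return primer.replace(" ", "").upper()


def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in the primer."""
    seq = _clean(primer)
    return (seq.count("G") + seq.count("C")) / len(seq)


def meets_basic_criteria(primer: str,
                         min_len: int = 22,
                         max_len: int = 30,
                         min_gc: float = 0.50) -> bool:
    """True if the primer satisfies the length and GC content criteria."""
    seq = _clean(primer)
    return min_len <= len(seq) <= max_len and gc_fraction(seq) >= min_gc


# The XLR primer of Example 1, written as in the specification.
example_xlr = "AGC TGT CCA TGA TGA ACA CAC G"
print(len(_clean(example_xlr)), gc_fraction(example_xlr),
      meets_basic_criteria(example_xlr))
```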

Primers varying from the conditions described above may
result in amplification of the desired targets, provided that
extension conditions have been adjusted.

Figure 3 shows the next step, in which a cDNA library is used
as a template and the two primers (XLR and XLS) amplify plasmids
containing the gene of interest. In this step, it is very helpful
to use PCR enzymes which provide high fidelity and copy long
sequences, such as that provided in the XL-PCR kit (Part No.
N808-0182, Perkin Elmer, Applied Biosystems, Foster City, CA).
Generally, kit instructions should be followed, including
suggestions to optimize concentrations of various reagents. In
the examples disclosed infra, 25 pmol of each primer worked well.
Template (plasmid library) concentrations can be varied (see
Examples infra for details). It is essential to thoroughly
resuspend the enzyme in solution prior to use, especially if the
solution has been stored at -20°C. If the enzyme is not
adequately resuspended, its effectiveness is impaired. The
preferred system is set up initially in two layers, employing
AmpliWax PCR Gems. However, efficiency can be increased by
avoiding the use of these Gems and initiating amplification by
using the "hot-start" technique of adding magnesium, which is
essential for amplification, at 82°C.

Although various cycling conditions are detailed in the
examples infra, the following cycling conditions have been found
to be optimal with the MJ PTC200 thermocycler (MJ Research,
Watertown, MA). Times and temperatures may be varied to optimize
conditions in different thermocyclers.

Step 1   94° for 60 sec (initial denaturation)
Step 2   94° for 15 sec
Step 3   65° for 1 min
Step 4   68° for 7 min
Step 5   Repeat steps 2-4 for 15 additional times
Step 6   94° for 15 sec
Step 7   65° for 1 min
Step 8   68° for 7 min + 15 sec/cycle
Step 9   Repeat steps 6-8 for 11 additional times
Step 10  72° for 8 min
Step 11  4° for 0.00 sec (to hold at 4°)
At the end of these 28 cycles, 50 µl of the reaction mix is
removed; on the remaining reaction mix, an additional 10
cycles are run, as outlined below:

Step 1   94° for 15 sec
Step 2   65° for 1 min
Step 3   68° for (10 min + 15 sec)/cycle
Step 4   Repeat steps 1-3 for 9 additional times
Step 5   72° for 10 min
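
The cycle counts and extension times implied by the two programs above can be tallied as follows. This is a minimal sketch with a hypothetical helper function. It assumes that "+ 15 sec/cycle" means the first cycle of a block runs at the base extension time and each later cycle of that block is 15 seconds longer; thermocyclers may implement the increment differently.

```python
from typing import List


def extension_times(base_sec: int, cycles: int, increment_sec: int = 0) -> List[int]:
    """Extension time in seconds for each cycle of one block."""
    return [base_sec + i * increment_sec for i in range(cycles)]


# First program: steps 2-4 run once plus 15 repeats, steps 6-8 once plus 11 repeats.
block_a = extension_times(7 * 60, cycles=1 + 15)
block_b = extension_times(7 * 60, cycles=1 + 11, increment_sec=15)

# Additional program: steps 1-3 run once plus 9 repeats.
block_c = extension_times(10 * 60, cycles=1 + 9, increment_sec=15)

first_program_cycles = len(block_a) + len(block_b)
total_cycles = first_program_cycles + len(block_c)
first_program_extension_min = (sum(block_a) + sum(block_b)) / 60

print("cycles in first program:", first_program_cycles)
print("cycles including additional program:", total_cycles)
print("extension time in first program (min):", first_program_extension_min)
```

The first program comes to 28 cycles, in agreement with the text, and the additional program brings the total to 38.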

Next, a 5-10 µl aliquot of the reaction mixture can be
analyzed on a mini-gel to determine which reactions were
successful.

Step 2: Purification of amplicons containing the gene of interest

Figure 4 is a graphical representation of the amplified cDNA
segments which have been separated on an agarose gel. Note that
there are a variety of lengths of cDNA. Although the rest of the
method could be performed using all extended cDNA species, the
method can proceed optionally after selecting the largest products
(likeliest to provide the remainder of the full-length gene).
Some of the larger species may in fact be hybrid clones which
contain two cDNA inserts as a result of a malfunction during the
cDNA library construction, which may represent an incomplete
digestion with the restriction enzyme at the end of the cDNA
synthesis. Such amplified hybrid clones, also called chimeras,
could result in overlooking the correct targeted extensions.

Successful reaction products should be purified on an agarose
gel (preferably at a low agarose concentration of 0.6-0.8%)
or by another appropriate method. An appropriate volume of
reaction mixture should be loaded to obtain good separation of the
products and to separate them from the plasmid library (template)
still in the reaction mixture. Contamination with the template
cDNA library will result in transformants which do not contain the
desired gene and will require an extensive screening of many
colonies. The bands representing the genes of interest are then
cut out of the gel and purified using a method like the QIAQuick
gel extraction kit (Qiagen, Inc., Chatsworth, CA).

Step 3: Cloning of amplicons containing the gene of interest

Any overhangs are converted into blunt ends to
facilitate religation and cloning of the products. For this
purpose, Klenow enzyme (3 units/reaction mixture) and dNTPs (0.2
mM final concentration) are added and the reaction is incubated at
room temperature for 30 min. The Klenow enzyme is then
inactivated by incubating the reaction at 75° for 15 min.

The products are then ethanol precipitated and redissolved in
13 µl of ligation buffer containing 1 mM ATP. 1 µl T4 DNA ligase
(15 units) and T4 polynucleotide kinase (5 units) are added and
the reaction is incubated at room temperature for 2-3 hours or
overnight at 16°C.

3 µl of the ligation mixture are transformed into 40 µl of
competent E. coli cells (prepared with a standard protocol). 80 µl
of SOC medium are added and, after 1 hour of recovery of the cells
at 37°C, the whole transformation mixture is plated on LB-agar
2XCarb-containing petri plates.

Step 4: Screening of cloned products

The next day 8 or 12 colonies are randomly picked from each
plate and grown in individual wells of a sterile 96-well
microtiter plate (e.g. 96 Well Cell Culture Cluster, Catalog No.
3799, Costar Corp., Cambridge, MA 02140). Each well contains 150 µl
of LB/2XCarb medium. Thus, each row of the microtiter plate
contains twelve clones from the same extension reaction. The
cells are grown overnight at 37°C.

The next day, 5 µl of these overnight cultures are transferred
into a non-sterile 96-well plate (Falcon 3911 Microtest III™
Flexible Assay Plate, Becton Dickinson, Oxnard, CA) and diluted
1:10 with water. 5 µl of each dilution are then transferred into a
PCR array (e.g., Cycleplate, Robbins Scientific Corp., Sunnyvale,
CA). To obtain a 1X final concentration of PCR reagents, 15 µl of
a 1.33X concentrated PCR mix are added to each well. Another
efficient way of screening for extension products is the multiplex
PCR method, in which multiple specific primers are pooled and
submitted to the same reaction, thereby increasing the efficiency
of setting up the screening mixtures. Addition of the PCR template
(individual cultures) has been improved by the use of a 96-pin
tool with which an aliquot of all 96 cultures grown as described
above can be transferred into the PCR screening mix in a matter of
1-2 minutes.

For PCR amplification, the final concentrations are 1X for the
PCR mix and 5 µM for each of a vector primer and one or both of the
gene-specific primers used for the original extension reaction;
0.75 units of Taq polymerase are added to each well.
Amplification generally was performed using the following
conditions:

Step 1   94°C for 60 sec
Step 2   94°C for 20 sec
Step 3   55°C for 30 sec
Step 4   72°C for 90 sec
Step 5   repeat steps 2-4 for an additional 29 times
Step 6   72°C for 180 sec
Step 7   4°C forever

Aliquots of these PCR reactions are run on agarose gels
together with molecular weight markers. The size of the resulting
PCR products will allow direct determination of how much
additional sequence the selected clones contain compared to the
original partial cDNA. The efficiency of the method has been
further improved by using the resulting PCR products directly for
sequencing, thus avoiding the necessity of preparing plasmids.

The appropriate clones are selected and grown for plasmid
preparation and sequencing.
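
The relationship between the size of a screening PCR product and the amount of new sequence can be written out explicitly. The following is a minimal sketch under stated assumptions: the product is taken to span from the vector primer to the gene-specific primer, so the new sequence is what remains after subtracting the vector flank between the vector primer and the insert and the already known sequence between the start of the partial cDNA and the gene-specific primer. The function name and all numbers in the usage lines are hypothetical.

```python
def additional_sequence_bp(product_bp: int,
                           vector_flank_bp: int,
                           known_to_primer_bp: int) -> int:
    """Estimate the new sequence (in bp) contained in a screened clone.

    product_bp          size of the screening PCR product read from the gel
    vector_flank_bp     vector sequence between the vector primer and the insert
    known_to_primer_bp  known cDNA sequence from its first base through the
                        gene-specific primer (primer included)
    """
    extra = product_bp - vector_flank_bp - known_to_primer_bp
    # A clone cannot contain less than no additional sequence; gel sizing
    # is approximate, so small negative values are clamped to zero.
    return max(extra, 0)


# Hypothetical values for illustration only.
print(additional_sequence_bp(product_bp=900, vector_flank_bp=80,
                             known_to_primer_bp=54))
```

Because band sizes read from an agarose gel are approximate, such an estimate serves only to rank clones before sequencing.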
Plasmid preparations are made with standard kits familiar to
those skilled in the art. Examples include the PROMEGA Magic
MINIPREP and the AGTG alkaline lysis kit.

Sequencing is performed employing standard automated ABI
sequencing equipment and protocols using either dye-primer or
dye-terminator kits.

Sequence processing and assemblage of the sequencing data are
performed using standard ABI software, including INHERIT™ analysis
and the Power assembler.
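
The principle of extending a known sequence by overlap with a newly obtained sequence can be illustrated with a short sketch. The code below is not the ABI software named above; it is a hypothetical exact-match merge, offered only to show the idea. It assumes both reads are on the same strand and free of sequencing errors, which real assembly software does not require.

```python
def merge_by_overlap(left: str, right: str, min_overlap: int = 20) -> str:
    """Join two sequences where a suffix of `left` equals a prefix of `right`.

    The longest exact overlap of at least `min_overlap` bases is used.
    Raises ValueError if no such overlap exists.
    """
    if min_overlap < 1:
        raise ValueError("min_overlap must be at least 1")
    longest_possible = min(len(left), len(right))
    for k in range(longest_possible, min_overlap - 1, -1):
        if left[-k:] == right[:k]:
            return left + right[k:]
    raise ValueError("no overlap of the required length found")


# Short hypothetical sequences; a small min_overlap is used for readability.
known_partial = "AAACCCGGG"
new_read = "CCGGGTTT"
extended = merge_by_overlap(known_partial, new_read, min_overlap=3)
print(extended)
```

For extension at the 5' end the same function applies with the new read as `left` and the known partial sequence as `right`.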
INDUSTRIAL APPLICABILITY

Example 1 

For the initial method evaluation, a known gene was selected.
A partial sequence of the human 90-kDa heat-shock protein gene
(HUMHSP90, accession M16660) had been identified in a THP-1
library. This partial sequence (Incyte clone T-014201) initiated
at base 1127 of the sequence with accession number M16660.
1.1 Primer design 

Two primers were designed to perform the method described in
the invention.

Primer 1 (XLR) 5' AGC TGT CCA TGA TGA ACA CAC G 3' 
(1180-1159) 

Primer 2 (XLS) 5' AAT AGG CAC CAC ACC AAC TGA G 3' 
(2011-2032) 
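
The primer coordinates given above can be checked against the primer sequences. The following minimal sketch uses hypothetical helper names and assumes the coordinate pairs are listed from the 5' end to the 3' end of each primer on the M16660 numbering, so that descending coordinates denote an antisense primer.

```python
def span_length(start: int, end: int) -> int:
    """Inclusive length of a coordinate range, in either direction."""
    return abs(end - start) + 1


def orientation(start: int, end: int) -> str:
    """'sense' if coordinates ascend from 5' to 3', otherwise 'antisense'."""
    return "sense" if end >= start else "antisense"


# Sequences and coordinates as listed for Example 1.
primers = {
    "XLR": ("AGCTGTCCATGATGAACACACG", 1180, 1159),
    "XLS": ("AATAGGCACCACACCAACTGAG", 2011, 2032),
}

for name, (seq, start, end) in primers.items():
    # The number of bases must equal the length of the coordinate span.
    assert len(seq) == span_length(start, end)
    print(name, len(seq), "nt,", orientation(start, end))
```

Both primers are 22 nucleotides long, and their 3' ends point away from each other, as the method requires.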
1.2 Template preparation

A THP-1 cDNA library constructed in the LambdaZAP vector
(Stratagene) was converted into a plasmid library following the
mass excision protocol. Plasmids of the excised libraries were
prepared using the Qiagen Midi plasmid purification kit.
1.3 XL-PCR reaction set-up

The extension reactions were prepared following the
instructions provided with the GeneAmp XL PCR Kit (Part No.
N808-0182) from Perkin Elmer. A two layer system was set up as
follows:

The lower reagent mix was prepared by pipetting the following
components into a 0.2 ml MicroAmp reaction tube.



Lower reagent mix preparation:

Water                     13.6 µl
3.3X buffer               12.0 µl
dATP (10 mM)               2.0 µl
dCTP (10 mM)               2.0 µl
dGTP (10 mM)               2.0 µl
dTTP (10 mM)               2.0 µl
Primer XLS (50 µM)         1.0 µl
Primer XLR (50 µM)         1.0 µl
Mg(OAc)2 (25 mM)           4.4 µl

Total lower reagent mix   40.0 µl

One AmpliWax™ gem was added to the tube. The wax was melted
by incubating the reaction tubes at 75°C for 5 minutes. Then the
tubes were cooled down to 4°C.

Upper reagent mix preparation:

3.3X buffer               18.0 µl
rTth DNA Polymerase        2.0 µl

Total upper enzyme mix    20.0 µl

20 µl of the enzyme/buffer mix are added to each tube and
kept separated from the lower mix by the wax layer.
Addition of template:

The template DNA (excised library) was diluted to an
appropriate concentration in water and then added to the upper
mix. Mixing of the components is not necessary.

Template (6.25 ng/ml)     40.0 µl

Final volume             100.0 µl
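
The volumes listed for the two-layer reaction can be cross-checked, and the final concentrations of key components derived, with a few lines of arithmetic. The following is a minimal sketch; the dictionary keys and the helper function are hypothetical, and the stock concentrations are those read from the tables above.

```python
# Volumes in microliters, as listed for the lower mix, upper mix and template.
lower_mix_ul = {
    "water": 13.6, "3.3X buffer": 12.0,
    "dATP": 2.0, "dCTP": 2.0, "dGTP": 2.0, "dTTP": 2.0,
    "primer XLS": 1.0, "primer XLR": 1.0, "Mg(OAc)2": 4.4,
}
upper_mix_ul = {"3.3X buffer": 18.0, "rTth DNA polymerase": 2.0}
template_ul = 40.0

lower_total = sum(lower_mix_ul.values())
upper_total = sum(upper_mix_ul.values())
final_volume = lower_total + upper_total + template_ul


def final_concentration(stock: float, volume_ul: float, total_ul: float) -> float:
    """Concentration after dilution, in the same unit as `stock`."""
    return stock * volume_ul / total_ul


mg_mM = final_concentration(25.0, lower_mix_ul["Mg(OAc)2"], final_volume)
dntp_each_mM = final_concentration(10.0, lower_mix_ul["dATP"], final_volume)
buffer_x = final_concentration(
    3.3, lower_mix_ul["3.3X buffer"] + upper_mix_ul["3.3X buffer"], final_volume)

print(f"lower {lower_total:.1f} µl, upper {upper_total:.1f} µl, "
      f"final {final_volume:.1f} µl")
print(f"Mg(OAc)2 {mg_mM:.2f} mM, each dNTP {dntp_each_mM:.2f} mM, "
      f"buffer {buffer_x:.2f}X")
```

The totals agree with the 40.0 µl, 20.0 µl and 100.0 µl stated in the tables, and the combined buffer comes to approximately 1X in the final volume.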

1.4 XL-PCR amplification

For amplification the following protocol was employed: 
Step 1   94° for 60 sec (initial denaturation)
Step 2   94° for 15 sec
Step 3   65° for 1 min
Step 4   68° for 7 min
Step 5   Repeat steps 2-4 for 15 additional times
Step 6   94° for 15 sec
Step 7   65° for 1 min
Step 8   68° for 7 min + 15 sec/cycle
Step 9   Repeat steps 6-8 for 11 additional times
Step 10  72° for 8 min
Step 11  4° for 0.00 sec (to hold at 4°)



1.5 Purification of amplified products 

30 µl of the amplified products were run on a 0.7% agarose
gel for 16 hours. Visible DNA bands were then cut out and purified
using the QIAquick gel purification kit.

1.6 Cloning of amplified products 

Klenow enzyme (3 units/reaction) and dNTPs (0.2 mM final
concentration) were added and the reactions were incubated at room
temperature for 30 min followed by incubation at 75°C for 15 min.
The products were then ethanol precipitated and redissolved in 13
µl of ligation buffer containing 1 mM ATP. T4 DNA ligase (15 units)
and T4 polynucleotide kinase (5 units) were added, and the
reaction was incubated at room temperature for 3 hours.

25 3|ll of the ligation mixture were transformed into 40 ml of 

competent E.coli cells. After heatshocking the cells at 42° C for 
4 5 seconds r 80 |ll of SOC medium were added, and the cells were 
allowed to recover at 37^ c for 1 hour. The whole transformation 
mixture then was plated on LB-agar/2XCarb-containing petri dish 

30 plates. 

1.7 Screening of cloned products

The next day 10 colonies were randomly picked and grown
overnight in Falcon 2059 tubes (Becton Dickinson, Oxnard, CA)
containing 3 ml of LB-broth with 2X Carb.

5 µl of the cultures were diluted 1:10 with water and 5 µl of
this dilution were transferred into MicroAmp™ PCR tubes (Perkin
Elmer, Applied Biosystems, Foster City, CA).

15 µl of a 1.33X concentrated PCR mix were added to each
tube.

The 1.33X concentrated PCR mix contained the following
components:

10X PCR-buffer 2.0 µl

2 mM dNTPs 2.0 µl

M13 rev primer (0.01 mM) 1.0 µl

Primer 2 (XLR, 0.01 mM) 1.0 µl

Taq Polymerase 0.15 µl

Water 8.85 µl

Final Volume 15.0 µl
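The "1.33X" label can be checked with a little arithmetic: 15 µl of mix plus 5 µl of diluted culture gives a 20 µl reaction. The sketch below is only an illustration of that calculation; the variable names are hypothetical, and the volumes and stock concentrations are read from the component list above.

```python
# Volumes (µl) of the screening PCR mix, as listed in section 1.7.
MIX_UL = {
    "10X PCR buffer": 2.0,
    "2 mM dNTPs": 2.0,
    "M13 rev primer (0.01 mM)": 1.0,
    "primer XLR (0.01 mM)": 1.0,
    "Taq polymerase": 0.15,
    "water": 8.85,
}
SAMPLE_UL = 5.0  # diluted overnight culture added to each reaction

mix_volume = round(sum(MIX_UL.values()), 2)
reaction_volume = mix_volume + SAMPLE_UL

# How concentrated the mix is relative to the final reaction.
concentration_factor = reaction_volume / mix_volume

# Final concentrations after adding the sample.
buffer_final_x = 10.0 * MIX_UL["10X PCR buffer"] / reaction_volume
dntp_final_mM = 2.0 * MIX_UL["2 mM dNTPs"] / reaction_volume
primer_final_uM = 0.01 * 1000 * 1.0 / reaction_volume  # 0.01 mM = 10 µM stock

print(f"mix {mix_volume} µl, reaction {reaction_volume} µl, "
      f"mix is {concentration_factor:.2f}X")
```

Under these readings the buffer ends up at 1X, which is consistent with the mix being described as 1.33X concentrated.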

The PCR cycling conditions were chosen as follows:
Step 1 94°C for 60 sec
Step 2 94°C for 20 sec
Step 3 55°C for 30 sec
Step 4 72°C for 90 sec
Step 5 repeat steps 2-4 for an additional 29 times
Step 6 72°C for 180 sec
Step 7 4°C (hold)

Aliquots of the amplified products were run on a 0.8% agarose 
gel in parallel with the 1 kb DNA ladder (Life Technologies, 
Gaithersburg, MD 20897) . Appropriate plasmids containing different 
size inserts were selected for sequencing analysis. 
1.8 Sequencing analysis of cloned products

The DNA of the selected clones was prepared using the
Wizard™ Minipreps DNA Purification System (Promega Corporation,
Madison, WI) following the instructions of the manufacturer.
Sequencing reactions were performed using the PRISM™ Ready
Reaction DyeDeoxy Terminator Cycle Sequencing Kit (Part No 401628,
Perkin Elmer, Applied Biosystems, Foster City, CA).
1.9 Analysis of sequenced products 

Three clones were selected for sequencing (14201.3, 14201.5,
14201.13). The sequences obtained (SEQ ID NOS:3-5, respectively)
were aligned using the DNASIS Multiple sequence alignment program.

Clone 14201.3 initiated at base 24 of the published sequence
(HUMHSP90), clone 14201.5 initiated at base 13 of the published
sequence, and clone 14201.13 initiated at base 538 of the published
sequence; the original clone (14201) initiated at base 1127 of the
published sequence.

Figures 7A-7H show an alignment of the obtained sequences
with the published human Hsp 90 nucleotide sequence. Clones
14201.3 and 14201.5 contain part of the 5' untranslated region and
therefore the full coding region of the gene has been obtained.
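The start positions reported above translate directly into how far each clone reaches upstream of the original partial cDNA. The following is a minimal sketch of that subtraction, using only the base numbers stated in section 1.9; the variable names are illustrative.

```python
# Start positions on the published HUMHSP90 sequence, from section 1.9.
ORIGINAL_START = 1127  # original partial clone 14201
CLONE_STARTS = {"14201.3": 24, "14201.5": 13, "14201.13": 538}

# Bases gained in the 5' direction relative to the original clone.
extension = {clone: ORIGINAL_START - start
             for clone, start in CLONE_STARTS.items()}
longest = max(extension, key=extension.get)

for clone, gained in extension.items():
    print(f"clone {clone}: {gained} bases further upstream")
print(f"longest 5' extension: clone {longest}")
```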
Example 2 

For further method evaluation, a second known gene was
selected. A partial sequence from a liver library was found to be
related to that of the human cathepsin B gene (accession L16510,
HUMCATHB, SEQ ID NO:6). This partial sequence (Incyte clone
87058, SEQ ID NO:7) initiated at base 1066 of the sequence with
accession number L16510.

2.1 Primer design 

Two primers were designed to perform the method described in
the invention:

Primer 1 (XLR) 5' AAG CCA TTG TCA CCC CAG TCA G 3'
(1103-1082)

Primer 2 (XLS) 5' GGT TCA CTG TGG AAT CGA ATC 3'
(1125-1145)
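The coordinates given with the two primers can be checked against the published sequence: XLR (1103-1082) should be the reverse complement of bases 1082-1103, and XLS (1125-1145) should read in the sense direction. The sketch below is an illustration only; it uses the stretch of SEQ ID NO:6 covering bases 1081-1150 as printed in the sequence listing, and the helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]

# HUMCATHB bases 1081-1150, copied from SEQ ID NO:6 in the listing.
PUBLISHED = ("ACTGACTGGG" "GTGACAATGG" "CTTCTTTAAA" "ATACTCAGAG"
             "GACAGGATCA" "CTGTGGAATC" "GAATCAGAAG")
OFFSET = 1081  # position of the first base in PUBLISHED

def region(start, end):
    """Bases start..end (1-based, inclusive) of the published sequence."""
    return PUBLISHED[start - OFFSET:end - OFFSET + 1]

XLR = "AAGCCATTGTCACCCCAGTCAG"  # antisense primer, 1103-1082
XLS = "GGTTCACTGTGGAATCGAATC"   # sense primer, 1125-1145

xlr_matches = reverse_complement(region(1082, 1103)) == XLR
xls_mismatches = [1125 + i
                  for i, (pub, primer) in enumerate(zip(region(1125, 1145), XLS))
                  if pub != primer]

print("XLR is reverse complement of 1082-1103:", xlr_matches)
print("XLS positions differing from published sequence:", xls_mismatches)
```

As printed in the listing, XLS differs from the published sequence at one base and agrees with the partial clone (SEQ ID NO:7) at that position, which is what one would expect if the primers were designed from the partial cDNA rather than from the GenBank entry.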

2.2 Template preparation 






WO 96/38591 PCT/US96/08501



A liver cDNA library constructed in the LambdaZAP vector
(Stratagene) was converted into a plasmid library following the
mass excision protocol. Plasmids of the excised libraries were
prepared using the QIAGEN Midi plasmid purification kit.

2.3 XL-PCR reaction set-up

The extension reactions were prepared following the
instructions provided with the GeneAmp XL PCR Kit (Part No.
N808-0182) from Perkin Elmer. A two layer system was set up as
described below. The lower reagent mix was prepared by pipetting
the following components into a 0.2 ml MicroAmp reaction tube.
Lower reagent mix preparation:



Water 13.6 µl

3.3X buffer 12.0 µl

dATP (10 mM) 2.0 µl

dCTP (10 mM) 2.0 µl

dGTP (10 mM) 2.0 µl

dTTP (10 mM) 2.0 µl

Primer XLS 1.0 µl

Primer XLR (50 µM) 1.0 µl

Mg(OAc)2 (25 mM) 4.4 µl

Total lower reagent mix 40.0 µl



One AmpliWax™ gem was added to the tube. This was melted by
incubating the reaction tubes at 75°C for 5 minutes. Then the
tubes were cooled down to 4°C.
Upper reagent mix preparation: 



3.3X buffer 18.0 µl

rTth DNA Polymerase 2.0 µl

Total upper enzyme mix 20.0 µl

20 µl of the enzyme/buffer mix were added to each tube and
kept separated from the lower mix by the wax layer.
Addition of template: 

The template DNA (excised library) was diluted to an 
appropriate concentration in water and then added to the upper 
mix. Mixing of the components is not necessary. 
Template (6.25 ng/µl) 40.0 µl

Final volume 100.0 µl
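The volumes in section 2.3 can be checked for internal consistency, and the final concentrations in the 100 µl reaction follow from them. The sketch below is illustrative only and rests on stated assumptions: the Primer XLS stock is taken as 50 µM and Mg(OAc)2 as 25 mM, as in Example 1, and the template concentration is read as ng/µl. The names are hypothetical.

```python
# Volumes in µl, from the lower and upper reagent mixes of section 2.3.
LOWER_MIX = {
    "water": 13.6,
    "3.3X buffer": 12.0,
    "dATP": 2.0, "dCTP": 2.0, "dGTP": 2.0, "dTTP": 2.0,
    "primer XLS": 1.0, "primer XLR": 1.0,
    "Mg(OAc)2": 4.4,
}
UPPER_MIX = {"3.3X buffer": 18.0, "rTth DNA polymerase": 2.0}
TEMPLATE_UL = 40.0

def final_conc(stock, volume_ul, total_ul):
    """Concentration after diluting volume_ul of stock into total_ul."""
    return stock * volume_ul / total_ul

lower_total = round(sum(LOWER_MIX.values()), 1)
upper_total = round(sum(UPPER_MIX.values()), 1)
final_volume = round(lower_total + upper_total + TEMPLATE_UL, 1)

dntp_each_mM = final_conc(10.0, 2.0, final_volume)    # 10 mM stocks
primer_uM = final_conc(50.0, 1.0, final_volume)       # assumed 50 µM stocks
mg_mM = final_conc(25.0, 4.4, final_volume)           # assumed 25 mM stock
buffer_x = final_conc(3.3, 12.0 + 18.0, final_volume)
template_ng = 6.25 * TEMPLATE_UL                      # assumed ng/µl

print(f"lower {lower_total} + upper {upper_total} + template {TEMPLATE_UL} "
      f"= {final_volume} µl")
print(f"each dNTP {dntp_each_mM} mM, primers {primer_uM} µM, "
      f"Mg(OAc)2 {mg_mM:.2f} mM, buffer {buffer_x:.2f}X, "
      f"template {template_ng} ng")
```

The totals agree with the stated 40.0 µl, 20.0 µl and 100.0 µl, and the buffer works out to roughly 1X, which supports reading the garbled unit symbols as µl throughout.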





2.4 XL-PCR amplification

For amplification the following protocol was employed:
Step 1 94°C for 60 sec (initial denaturation)
Step 2 94°C for 15 sec
Step 3 65°C for 1 min
Step 4 68°C for 7 min
Step 5 Repeat step 2-4 for 15 additional times
Step 6 94°C for 15 sec
Step 7 65°C for 1 min
Step 8 68°C for 7 min + 15 sec/cycle
Step 9 Repeat step 6-8 for 11 additional times
Step 10 72°C for 8 min
Step 11 4°C (hold)


2.5 Purification of amplified products

30 µl of the amplified products were run on a 0.7% agarose
gel for 16 hours. Visible DNA bands were then cut out and purified
using the QIAquick gel purification kit.
2.6 Cloning of amplified products

Klenow enzyme (3 units/reaction) and dNTPs (0.2 mM final
concentration) were added, and the reactions were incubated at
room temperature for 30 min followed by incubation at 75°C for 15
min. The products were then ethanol precipitated and redissolved in
13 µl of ligation buffer containing 1 mM ATP. T4 DNA ligase (15
units) and T4 polynucleotide kinase (5 units) were added, and the
reaction was incubated at room temperature for 3 hours.

3 µl of the ligation mixture were transformed into 40 µl of
competent E. coli cells. After heat-shocking the cells at 42°C for
45 seconds, 80 µl of SOC medium were added, and the cells were
allowed to recover at 37°C for 1 hour. The whole transformation
mixture was then plated on LB-agar/2X Carb-containing petri
dishes.

2.7 Screening of cloned products 

The next day 10 colonies were randomly picked and grown
overnight in Falcon 2059 tubes (Becton Dickinson, Oxnard, CA
93030) containing 3 ml of LB-broth with 2X Carb.

5 µl of the cultures were diluted 1:10 with water and 5 µl of
this dilution were transferred into MicroAmp™ PCR tubes (Perkin
Elmer, Applied Biosystems, Foster City, CA).

15 µl of a 1.33X concentrated PCR mix were added to each
tube.

The 1.33X concentrated PCR mix contained the following
components:

10X PCR-buffer 2.0 µl

2 mM dNTPs 2.0 µl

M13 rev primer (0.01 mM) 1.0 µl

Primer 2 (XLR, 0.01 mM) 1.0 µl

Taq Polymerase 0.15 µl

Water 8.85 µl

Final Volume 15.0 µl

The PCR cycling conditions were as follows:
Step 1 94°C for 60 sec
Step 2 94°C for 20 sec
Step 3 55°C for 30 sec
Step 4 72°C for 90 sec
Step 5 repeat steps 2-4 for an additional 29 times
Step 6 72°C for 180 sec
Step 7 4°C (hold)

Aliquots of the amplified products were run on a 0.8% agarose
gel in parallel with the 1 kb DNA ladder (Life Technologies,
Gaithersburg, MD 20897). Appropriate clones containing different
size inserts were selected for sequencing analysis.

2.8 Sequencing analysis of cloned products

The DNA of the selected clones was prepared using the
Wizard™ Minipreps DNA Purification System (Promega Corporation,
Madison, WI) following the instructions of the manufacturer.
Sequencing reactions were performed using the PRISM™ Ready
Reaction DyeDeoxy Terminator Cycle Sequencing Kit (Part No 401628,
Perkin Elmer, Applied Biosystems, Foster City, CA).

2.9 Analysis of sequenced products 

Three clones were selected for sequencing (87058.6, 87058.8,
87058.16). The sequences obtained (SEQ ID NOS:8-10, respectively)
were aligned using the DNASIS Multiple sequence alignment program
and are shown in Figures 8A through 8F. Clone 87058.6 initiated
at base 644 of the published sequence (HUMCATHB, SEQ ID NO:6),
clone 87058.8 initiated at base 353 of the published sequence, and
clone 87058.16 initiated at base 58 of the published sequence; the
original clone (87058, SEQ ID NO:7) initiated at base 1058 of the
published sequence.

Figures 8A through 8F show an alignment of the obtained
sequences with the published human cathepsin B nucleotide sequence.
Clone 87058.16 contains part of the 5' UT and therefore the full
coding region of the gene.
Example 3
In Example 3, a full length cDNA (SEQ ID NO:11) of a novel
P2U purinergic receptor homolog was obtained by the inventive
method and is the subject of U.S. Patent Application 08/459,046
filed June 2, 1995, which is hereby incorporated by reference.
Inherit™ and BLAST search and alignment tools were used to relate
a partial sequence found in Incyte Clone 179696 from the placental
cDNA library to the GenBank sequence of RNU09402, a G-protein
coupled surface receptor from rat (Rice WR et al (1995) Am J
Respir Cell Molec Biol 12:27-32).

The cDNA of Incyte 179696 was extended to full length using a
modified XL-PCR (Perkin Elmer) procedure. Primers were designed
based on known sequence; one primer was synthesized to initiate
extension in the antisense direction (XLR) and the other to extend
sequence in the sense direction (XLF). The primers allowed the
sequence to be extended "outward" from the known sequence, thus
generating amplicons containing new, unknown nucleotide sequence
comprising the gene of interest. The primers were designed using
Oligo 4.0 (National Biosciences Inc, Plymouth MN) to be 22-30
nucleotides in length, to have a GC content of 50% or more, and to
anneal to the target sequence at temperatures about 68°-72°C.
Any stretch of nucleotides which would result in hairpin
structures and primer-primer dimerizations was avoided.

The cDNA library was used as a template, and XLR (bases
278-298) and XLF (bases 587-610) primers were used to extend and
amplify the 179696 sequence. By following the instructions for
the XL-PCR kit and thoroughly mixing the enzyme, high fidelity
amplification is obtained. Beginning with 25 pmol of each primer
and the recommended concentrations of all other components of the
kit, PCR was performed using the MJ PTC200 thermocycler (MJ
Research, Watertown MA) and the following parameters:
Step 1 94°C for 60 sec (initial denaturation)
Step 2 94°C for 15 sec
Step 3 65°C for 1 min
Step 4 68°C for 7 min
Step 5 Repeat step 2-4 for 15 additional cycles
Step 6 94°C for 15 sec
Step 7 65°C for 1 min
Step 8 68°C for 7 min + 15 sec/cycle
Step 9 Repeat step 6-8 for 11 additional cycles
Step 10 72°C for 8 min
Step 11 4°C (and holding)
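The two-stage program above can be modelled to count cycles and estimate the programmed run time. This is a sketch under one explicit assumption: "7 min + 15 sec/cycle" is interpreted as 7 min in the first cycle of the second stage, growing by 15 sec in each later cycle (a thermocycler may instead add the increment from the first cycle on). Ramp times are ignored and the function names are hypothetical.

```python
def extension_times(cycles, base_s, increment_s=0):
    """Extension time in seconds for each cycle of a stage."""
    return [base_s + i * increment_s for i in range(cycles)]

def stage_duration(cycles, denature_s, anneal_s, base_extend_s, increment_s=0):
    """Total programmed seconds for one stage (no ramping)."""
    return sum(denature_s + anneal_s + ext
               for ext in extension_times(cycles, base_extend_s, increment_s))

# Stage 1: steps 2-4, run once plus 15 repeats.
STAGE_1_CYCLES = 1 + 15
# Stage 2: steps 6-8, run once plus 11 repeats, extension grows 15 s per cycle.
STAGE_2_CYCLES = 1 + 11

total_cycles = STAGE_1_CYCLES + STAGE_2_CYCLES
stage_2_extensions = extension_times(STAGE_2_CYCLES, 7 * 60, 15)

programmed_s = (
    60                                                    # step 1
    + stage_duration(STAGE_1_CYCLES, 15, 60, 7 * 60)      # steps 2-5
    + stage_duration(STAGE_2_CYCLES, 15, 60, 7 * 60, 15)  # steps 6-9
    + 8 * 60                                              # step 10
)

print(f"{total_cycles} cycles, last extension {stage_2_extensions[-1]} s, "
      f"about {programmed_s / 3600:.1f} h programmed time")
```

The cycle count comes out at 16 + 12 = 28, matching the "28 cycles" referred to in the text that follows the program.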

At the end of 28 cycles, 50 µl of the reaction mix was
removed, and the remaining reaction mix was run for an additional
10 cycles as outlined below:
Step 1 94°C for 15 sec
Step 2 65°C for 1 min
Step 3 68°C for (10 min + 15 sec)/cycle
Step 4 Repeat step 1-3 for 9 additional cycles
Step 5 72°C for 10 min

A 5-10 µl aliquot of the reaction mixture was analyzed by
electrophoresis on a low concentration (about 0.6-0.8%) agarose
mini-gel to determine which reactions were successful in extending
the sequence. Although all extensions potentially contain a full
length gene, some of the largest products or bands were selected
and cut out of the gel. Further purification involved using a
commercial gel extraction method such as QIAQuick™ (QIAGEN Inc,
Chatsworth CA). After recovery of the DNA, Klenow enzyme was used
to trim single-stranded nucleotide overhangs, creating blunt ends
which facilitated religation and cloning.

After ethanol precipitation, the products were redissolved in
13 µl of ligation buffer. Then, 1 µl T4 DNA ligase (15 units) and
1 µl T4 polynucleotide kinase were added, and the mixture was
incubated at room temperature for 2-3 hours or overnight at 16°C.
Competent E. coli cells (in 40 µl of appropriate media) were
transformed with 3 µl of ligation mixture and cultured in 80 µl of
SOC medium (Sambrook J et al, supra). After incubation for one
hour at 37°C, the whole transformation mixture was plated on
Luria Broth (LB)-agar (Sambrook J et al, supra) containing
carbenicillin at 25 mg/L. The following day, 12 colonies were
randomly picked from each plate and cultured in 150 µl of liquid
LB/carbenicillin medium placed in an individual well of an
appropriate, commercially-available, sterile 96-well microtiter
plate. The following day, 5 µl of each overnight culture was
transferred into a non-sterile 96-well plate and after dilution
1:10 with water, 5 µl of each sample was transferred into a PCR
array.

For PCR amplification, 15 µl of concentrated PCR reaction mix
(1.33X) containing 0.75 units of Taq polymerase, a vector primer
and one or both of the gene specific primers used for the
extension reaction were added to each well. Amplification was
performed using the following conditions:

Step 1 94°C for 60 sec
Step 2 94°C for 20 sec
Step 3 55°C for 30 sec
Step 4 72°C for 90 sec
Step 5 Repeat steps 2-4 for an additional 29 cycles
Step 6 72°C for 180 sec
Step 7 4°C (and holding)

Aliquots of the PCR reactions were run on agarose gels
together with molecular weight markers. The sizes of the PCR
products were compared to the original partial cDNAs, and
appropriate clones were selected, ligated into plasmid and
sequenced.
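Once a clone has been sequenced, the new read is joined to the known partial cDNA through the region where the two overlap. The sketch below illustrates that idea with exact suffix/prefix matching; it is a simplification, since real reads contain sequencing errors and the examples here relied on an alignment program instead. The function name, the `min_overlap` parameter and the toy sequences are all hypothetical.

```python
def merge_by_overlap(new_seq, known_seq, min_overlap=20):
    """Join an upstream read to a known sequence through an exact overlap.

    Returns the merged sequence when the end of new_seq matches the start
    of known_seq over at least min_overlap bases, otherwise None.
    """
    longest_possible = min(len(new_seq), len(known_seq))
    for size in range(longest_possible, min_overlap - 1, -1):
        if new_seq[-size:] == known_seq[:size]:
            return new_seq + known_seq[size:]
    return None

# Toy data: a known partial sequence and a read that overlaps its first
# 22 bases while adding 20 new bases upstream.
known = "GGTTCACTGTGGAATCGAATCAGAAGTGG"
upstream = "ACTGACTGGGGTGACAATGG"
new_read = upstream + known[:22]

merged = merge_by_overlap(new_read, known)
print(merged)
```

Requiring a minimum overlap length guards against joining sequences on a short chance match; the value of 20 is an arbitrary choice for this illustration.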

Example 4 

In this example, the inventive method was used to obtain a
novel full length cDNA from the partial sequence found in Incyte
clone 08118 which was found to be somewhat homologous to the
GenBank sequence of C5a anaphylatoxin receptor, a G-protein
coupled surface receptor from dog (Perret J et al (1995) Biochem
J 288:911-17). Based on the partial cDNA sequence, primers (XLR
= GAAAGACAGCCACCACCACCACG and XLF = AGAAAGCAAGGCAGTCCATTCAGG)
were designed. Essentially the same method outlined in Example 3
above was used to extend the partial sequence of 8118 to obtain
the full length sequence (SEQ ID NO:12) of a novel C5a-like
receptor homolog which is the subject of U.S. Patent Application
08/462,355 filed June 5, 1995, and whose disclosure is
incorporated by reference.
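The two Example 4 primers can be checked against the length and GC-content criteria stated in Example 3 (22-30 nucleotides, GC content of 50% or more). The sketch below covers only those two criteria; annealing temperature, hairpins and primer dimers, which the text says were evaluated with Oligo 4.0, are not modelled. The function names are hypothetical.

```python
def gc_fraction(primer):
    """Fraction of G and C bases in a primer."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)

def meets_basic_criteria(primer, min_len=22, max_len=30, min_gc=0.50):
    """Check primer length and GC content against the stated limits."""
    return min_len <= len(primer) <= max_len and gc_fraction(primer) >= min_gc

PRIMERS = {
    "XLR": "GAAAGACAGCCACCACCACCACG",
    "XLF": "AGAAAGCAAGGCAGTCCATTCAGG",
}

for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.1%}, "
          f"passes length/GC check: {meets_basic_criteria(seq)}")
```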

While the present invention has been described with reference
to specific enzymes and sequences, particularly PCR enzyme, and
formulations containing such, those skilled in the art understand
that various changes may be made and equivalents may be
substituted without departing from the true spirit and scope of
the invention. In addition, many modifications may be made to
adapt a particular situation, material, enzyme, process, process
step or steps and still carry out the objective, spirit and scope
of the invention. All such modifications are intended to be
within the scope of the claims appended hereto.

SEQUENCE LISTING

(1) GENERAL INFORMATION:



(i) APPLICANT: INCYTE PHARMACEUTICALS, INC.

(ii) TITLE OF INVENTION: IMPROVED METHOD FOR OBTAINING 

FULL LENGTH CDNA SEQUENCES 

(iii) NUMBER OF SEQUENCES: 12 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: INCYTE PHARMACEUTICALS, INC. 

(B) STREET: 3330 Hillview Avenue

(C) CITY: Palo Alto 

(D) STATE: CA 

(E) COUNTRY: USA 
(F) ZIP: 94304



(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER: To Be Assigned 

(B) FILING DATE: Filed Herewith 



(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION SERIAL NO: US 08/487,112 

(B) FILING DATE: 7-JUN-1995

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION SERIAL NO: US 08/462,355 

(B) FILING DATE: 5-JUN-1995 



(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION SERIAL NO: US 08/459,046 

(B) FILING DATE: 2-JUN-1995 



(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION SERIAL NO: US 08/566,334 

(B) FILING DATE: 1-DEC-1995



(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION SERIAL NO: US 60/006,809 

(B) FILING DATE: 15-NOV-1995 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Luther, Barbara J.

(B) REGISTRATION NUMBER: 33954

(C) REFERENCE/DOCKET NUMBER: HP-001-1 PCT

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 415-855-0555
(B) TELEFAX: 415-852-0195

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2543 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: GenBank HUMHSP90 

(B) CLONE: Accession No. M16660 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

CTCCGGCGCA GTGTTGGGAC TGTCTGGGTA TCGGAAAGCA AGCCTACGTT GCTCACTATT 60

ACGTATAATC CTTTTCTTTT CAAGATGCCT GAGGAAGTGC ACCATGGAGA GGAGGAGGTG 120

GAGACTTTTG CCTTTCAGGC AGAAATTGCC CAACTCATGT CCCTCATCAT CAATACCTTC 180

TATTCCAACA AGGAGATTTT CCTTCGGGAG TTGATCTCTA ATGCTTCTGA TGCCTTGGAC 240

AAGATTCGCT ATGAGAGCCT GACAGACCCT TCGAAGTTGG ACAGTGGTAA AGAGCTGAAA 300 

ATTGACATCA TCCCCAACCC TCAGGAACGT ACCCTGACTT TGGTAGACAC AGGCATTGGC 360 

ATGACCAAAG CTGATCTCAT AAATAATTTG GGAACCATTG CCAAGTCTGG TACTAAAGCA 420 

TTCATGGAGG CTCTTCAGGC TGGTGCAGAC ATCTCCATGA TTGGGCAGTT TGGTGTTGGC 480 

TTTTATTCTG CCTACTTGGT GGCAGAGAAA GTGGTTGTGA TCAGAAAGCA CAACGATGAT 540

GAACAGTATG CTTGGGAGTC TTCTGCTGGA GGTTCCTTCA CTGTGCGTGC TGACCATGGT 600 

GAGCCCATTG GCATGGGTAC CAAAGTGATC CTCCATCTTA AAGAAGATCA GACAGAGTAC 660 

CTAGAAGAGA GGCGGGTCAA AGAAGTAGTG AAGAAGCATT CTCAGTTCAT AGGCTATCCC 720 

ATCACCCTTT ATTTGGAGAA GGAACGAGAG AAGGAAATTA GTGATGATGA GGCAGAGGAA 780 

GAGAAAGGTG AGAAAGAAGA GGAAGATAAA GATGATGAAG AAAAGCCCAA GATCGAAGAT 840

GTGGGTTCAG ATGAGGAGGA TGACAGCGGT AAGGATAAGA AGAAGAAAAC TAAGAAGATC 900 

AAAGAGAAAT ACATTGATCA GGAAGAACTA AACAAGACCA AGCCTATTTG GACCAGAAAC 960 

CCTGATGACA TCACCCAAGA GGAGTATGGA GAATTCTACA AGAGCCTCAC TAATGACTGG 1020 

GAAGACCACT TGGCAGTCAA GCACTTTTCT GTAGAAGGTC AGTTGGAATT CAGGGCATTG 1080

CTATTTATTC CTCGTCGGGC TCCCTTTGAC CTTTTTGAGA ACAAGAAGAA AAAGAACAAC 1140 

ATCAAACTCT ATGTCCGCCG TGTGTTCATC ATGGACAGCT GTGATGAGTT GATACCAGAG 1200 
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TATCTCAATT TTATCCGTGG TGTGGTTGAC TCTGAGGATC TGCCCCTGAA CATCTCCCGA 1260 

GAAATGCTCC AGCAGAGCAA AATCTTGAAA GTCATTCGCA AAAACATTGT TAAGAAGTGC 1320

CTTGAGCTCT TCTCTGAGCT GGCAGAAGAC AAGGAGAATT ACAAGAAATT CTATGAGGCA 1380 

TTCTCTAAAA ATCTCAAGCT TGGAATCCAC GAAGACTCCA CTAACCGCCG CCGCCTGTCT 1440 

GAGCTGCTGC GCTATCATAC CTCCCAGTCT GGAGATGAGA TGACATCTCT GTCAGAGTAT 1500 

GTTTCTCGCA TGAAGGAGAC ACAGAAGTCC ATCTATTACA TCACTGGTGA GAGCAAAGAG 1560 

CAGGTGGCCA ACTCAGCTTT TGTGGAGCGA GTGCGGAAAC GGGGCTTCGA GGTGGTATAT 1620 

ATGACCGAGC CCATTGACGA GTACTGTGTG CAGCAGCTCA AGGAATTTGA TGGGAAGAGC 1680 

CTGGTCTCAG TTACCAAGGA GGGTCTGGAG CTGCCTGAGG ATGAGGAGGA GAAGAAGAAG 1740 

ATGGAAGAGA GCAAGGCAAA GTTTGAGAAC CTCTGCAAGC TCATGAAAGA AATCTTAGAT 1800 

AAGAAGGTTG AGAAGGTGAC AATCTCCAAT AGACTTGTGT CTTCACCTTG CTGCATTGTG 1860 

ACCAGCACCT ACGGCTGGAC AGCCAATATG GAGCGGATCA TGAAAGCCCA GGCACTTCGG 1920 

GACAACTCCA CCATGGGCTA TATGATGGCC AAAAAGCACC TGGAGATCAA CCCTGACCAC 1980 

CCCATTGTGG AGACGCTGCG GCAGAAGGCT GAGGCCGACA AGAATGATAA GGCAGTTAAG 2040 

GACCTGGTGG TGCTGCTGTT TGAAACCGCC CTGCTATCTT CTGGCTTTTC CCTTGAGGAT 2100 

CCCCAGACCC ACTCCAACCG CATCTATCGC ATGATCAAGC TAGGTCTAGG TATTGATGAA 2160 

GATGAAGTGG CAGCAGAGGA ACCCAATGCT GCAGTTCCTG ATGAGATCCC CCCTCTCGAG 2220 

GGCGATGAGG ATGCGTCTCG CATGGAAGAA GTCGATTAGG TTAGGAGTTC ATAGTTGGAA 2280 

AACTTGTGCC CTTGTATAGT GTCCCCATGG GCTCCCACTG CAGCCTCGAG TGCCCCTGTC 2340 

CCACCTGGCT CCCCCTGCTG GTGTCTAGTG TTTTTTTCCC TCTCCTGTCC TTGTGTTGAA 2400 

GGCAGTAAAC TAAGGGTGTC AAGCCCCATT CCCTCTCTAC TCTTGACAGC AGGATTGGAT 2460 

GTTGTGTATT GTGGTTTATT TTATTTTCTT CATTTTGTTC TGAAATTAAA GTATGCAAAA 2520 

TAAAGAATAT GCCGTTTTTA TAG 2543 
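Each line of the listing ends with a running base count, which makes it possible to spot lines damaged in transcription. The sketch below is a hypothetical checker, not part of the original disclosure: it compares the printed count with the number of bases actually present and flags characters other than A, C, G and T. It assumes every line ends in an integer.

```python
VALID_BASES = set("ACGT")

def parse_listing_line(line):
    """Split a listing line into its bases and its printed running count."""
    tokens = line.split()
    return "".join(tokens[:-1]), int(tokens[-1])

def check_listing(lines, start=0):
    """Return (line_index, counted, printed, bad_chars) for suspect lines."""
    problems = []
    running = start
    for index, line in enumerate(lines):
        bases, printed = parse_listing_line(line)
        running += len(bases)
        bad_chars = sorted(set(bases) - VALID_BASES)
        if printed != running or bad_chars:
            problems.append((index, running, printed, bad_chars))
        running = printed  # resynchronise on the printed count
    return problems

# Last two lines of SEQ ID NO:1; the preceding line ends at base 2460.
SEQ1_TAIL = [
    "GTTGTGTATT GTGGTTTATT TTATTTTCTT CATTTTGTTC TGAAATTAAA GTATGCAAAA 2520",
    "TAAAGAATAT GCCGTTTTTA TAG 2543",
]
problems = check_listing(SEQ1_TAIL, start=2460)
print("suspect lines:", problems)
```

A line whose count has been split by a stray space, or which contains a non-base character, would be reported with the counted and printed totals side by side.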
(2) INFORMATION FOR SEQ ID NO : 2 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 261 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: THP-1

(B) CLONE: 14201

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

AAGAAAAAGA ACAACATCAA ACTCTATGTC CGCCGTGTGT TCATCATGGC AGCTGTGATG 60

AGTTGATACC AGAGTATCTC AATTTTATCC GTGGTGTGGT TGACTTGAGG TCTGCCCCTG 120

AACATCTCCC GGAAATGCTC CAGCAGAGCA AAATCTTGAA AGGCATTCGC AAAAACATTG 180

TTAAGAGTGC CTTAGCTCTT CTCTAGCTGG CAGAAGCAAG GGGATTTCAA GAAATTCTTT 240

TGGGGGGATT TCTTAAAAAT T 261
(2) INFORMATION FOR SEQ ID N0:3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 478 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: THP-1 

(B) CLONE: 14201.3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

GCTGGGTATC GGAAAGCAAG CCTACGTTGC TCACTATTAC GTATAATCCT TTTCTTCAAG 60 

ATGCCTGAGG AAGTGCACCA TGGAGAGGAG GAGGTGGAGA CTTTTGCCTT TCAGGCAGAA 120 

ATTGCCCAAC TCATGTCCCT CATCATCAAT ACCTCCTATT CCAACAAGGA GATTTCCTCG 180 

GGAGTTGATC TCTAATGCTT CTGATGCCTC GGACAAGATT CGCTATGAAG CCTGACAGAC 240 

CCTTCGAAGT GGTCAGCGGC AAGAGCTGAA AATTGACATC ATCCCCAACC CTCAGGAACG 300 

TCCCTGTACT TTGGGTAGAC ACAGGCATTG GCATAAACAA AGCTGACCTC ATATTATTCG 360 

GGGAACCATT GCCAAGTCTT GTCTAAAAGC ATTCATGGAG GCTCTCAGGT TGGCGCAGAC 420 

ATCTCCAGAT TGGCAGGTGG GTGTTGGCTT TATTCTGCCC ACTTGGTGGC AGAGAAAT 478 
(2) INFORMATION FOR SEQ ID NO:4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 508 base pairs

(B) TYPE : nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
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(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: THP-1 

(B) CLONE: 14201.5 

(xi) SEQUENCE DESCRIPTION: SEQ ID N0:4: 

GTTGGGACTG TCTGGGTATC GGAAAGCAAG CCTACGTTGC TCACTATTAC GTATAATCCT 60 

TTTCTTTTCA AGATGCCTGA GGAAGTGCAC CATGGAGAGG AGGAGGTGGA GACTTTTGCC 120 

TTTCAGGCAG AAATTGCCCA ACTCATGTCC CTCATCATCA ATACCTCCTA TTCCAACAAG 180 

GAGATTTTCC TTCGGGAGTT GATCTCTAAT GCTTCTGATG CCTTGGACAA GATTCGCTAT 240 

GAGAGCCTGA CAGACCCTTC GAAGTTGGAC AGTGGTAAAG AGCTGAAAAT TGACATCATC 300 

CCCAACCCTC AGGAACGTAC CCTGACTTTG GGTAGACACA GGCATCGGCA TGACCAAAAG 360 

CTGATCTCAT AATAATTGGG AACCATTGCA AGTCTGGTAC TAAAGCATTC ATGGAGGCTC 420 

TTCAGGCTGG TGCAGACATC TCCATGATTG GGCAGCTTGG GTGTTGCTTT ATTCTGCCTC 480 

CTTGGTGGCA GAGAAAGTGT TGTGATCA 508 

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 547 base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: THP-1 

(B) CLONE: 14201.13 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

TTGAGAGTAT GTCGAGTTAC TGTGGAGGTT CCTTCACTGC GTGCTGACAT GGTGAGCCCA 60 

TGGGAGCGGT ACCAAGTGAT CCTCCATCTC AAAGAAGATC AGACAGAGTA CCTAGAGAGA 120

GGCGGATCAA AGAGTAGTGA TGAGCATCCT CAGATCATAG GCTATCCCAT CACCCTTTTT 180

TGGAGAAGGA CGAGAGAAGG AATTAGGATG ATGAGGCAGA GGAAGAGAAT GGTGAGAATG 240

AAGAGGAGTA ACGATGATGA AGAAACCCCA AGATCGATGA TGTGGTTCAG ATGAGGGGAT 300

GACAGCGGTA GATTU^GAAGA AGAAACTAGA ATCATCGGAT CATGACAGGA AGAACTAACA 360 

GATCATCTTT CGGCCAGAAT CCCTGATGTC ATCACCCAAG AGGGTATGGA GATTTCTACA 420 

TGCAGCTCAC TTTACTGGGC AAGACACTTG GCAGCAACAC TTTTCTGTAG AAGGCCATTG 480

CATCACGCAT TGCTATTCTT CCCTCGCCGT CTCCTTTGAC CTGGTCTGGC ATCATGGTGT 540

CTTGATC 547

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1996 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: GenBank HUMCATHB 

(B) CLONE: Accession No. L16510 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

TCCGGCAACG CCAACCGCTC CGCTGCGCGC AGGCTGGGCT GCAGGCTCTC GGCTGCAGCG 60 

CTGGGCTGGT GTGCAGTGGT GCGACCACGG CTCACGGCAG CCTCAGCCAC CCAGATGTAA 120 

GCGATCTGGT TCCCACCTCA GCCTCCCGAG TAGTGGATCT AGGATCCGGC TTCCAACATG 180 

TGGCAGCTCT GGGCCTCCCT CTGCTGCCTG CTGGTGTTGG CCAATGCCCG GAGCAGGCCC 240 

TCTTTCCATC CCCTGTCGGA TGAGCTGGTC AACTATGTCA ACAAACGGAA TACCACGTGG 300

CAGGCCGGGC ACAACTTCTA CAACGTGGAC ATGAGCTACT TGAAGAGGCT ATGTGGTACC 360 

TTCCTGGGTG GGCCCAAGCC ACCCCAGAGA GTTATGTTTA CCGAGGACCT GAAGCTGCCT 420 

GCAAGCTTCG ATGCACGGGA ACAATGGCCA CAGTGTCCCA CCATCAAAGA GATCAGAGAC 480

CAGGGCTCCT GTGGCTCCTG CTGGGCCTTC GGGGCTGTGG AAGCCATCTC TGACCGGATC 540

TGCATCCACA CCAATGCGCA CGTCAGCGTG GAGGTGTCGG CGGAGGACCT GCTCACATGC 600 

TGTGGCAGCA TGTGTGGGGA CGGCTGTAAT GGTGGCTATC CTGCTGAAGC TTGGAACTTC 660 

TGGACAAGAA AAGGCCTGGT TTCTGGTGGC CTCTATGAAT CCCATGTAGG GTGCAGACCG 720 

TACTCCATCC CTCCCTGTGA GCACCACGTC AACGGCTCCC GGCCCCCATG CACGGGGGAG 780 

GGAGATACCC CCAAGTGTAG CAAGATCTGT GAGCCTGGCT ACAGCCCGAC CTACAAACAG 840 

GACAAGCACT ACGGATACAA TTCCTACAGC GTCTCCAATA GCGAGAAGGA CATCATGGCC 900 

GAGATCTACA AAAACGGCCC CGTGGAGGGA GCTTTCTCTG TGTATTCGGA CTTCCTGCTC 960 

TACAAGTCAG GAGTGTACCA ACACGTCACC GGAGAGATGA TGGGTGGCCA TGCCATCCGC 1020 

ATCCTGGGCT GGGGAGTGGA GAATGGCACA CCCTACTGGC TGGTTGCCAA CTCCTGGAAC 1080 

ACTGACTGGG GTGACAATGG CTTCTTTAAA ATACTCAGAG GACAGGATCA CTGTGGAATC 1140 
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GAATCAGAAG TGGTGGCTGG AATTCCACGC ACCGATCAGT ACTGGGAAAA GATCTAATCT 12 00 

GCCGTGGGCC TGTCGTGCCA GTCCTGGGGG CGAGATCGGG GTAGAAATGC ATTTTATTCT 1260 

TTAAGTTCAC GTAAGATACA AGTTTCAGGC AGGGTCTGAA GGACTGGATT GGCCAAACAT 1320 

CAGACCTGTC TTCCAAGGAG ACCAAGTCCT GGCTACATCC CAGCCTGTGG TTACAGTGCA 1380 

GACAGGCCAT GTGAGCCACC GCTGCCAGCA CAGAGCGTCC TTCCCCCTGT AGACTAGTGC 1440 

CGTGGGAGTA CCTGCTGCCC AGCTGCTGTG GCCCCCTCCG TGATCCATCC ATCTCCAGGG 1500 

AGCAAGACAG AGACGCAGGA TGGAAAGCGG AGTTCCTAAC AGGATGAAAG TTCCCCCATC 1560 

AGTTCCCCCA GTACCTCCAA GCAAGTAGCT TTCCACATTT GTCACAGAAA TCAGAGGAGA 1620 

GATGGTGTTG GGAGCCCTTT GGAGAACGCC AGTCTCCAGG TCCCCCTGCA TCTATCGAGT 1680 

TTGCAATGTC ACAACCTCTC TGATCTTGTG CTCAGCATGA TTCTTTAATA GAAGTTTTAT 1740 

TTTTCGTGCA CTCTGCTAAT CATGTGGGTG AGCCAGTGGA ACAGCGGGAG CCTGTGCTGG 1800 

TTTGCAGATT GCCTCCTAAT GACGCGGCTC AAAAGGAAAC CAAGTGGTCA GGAGTTGTTT 1860

CTGACCCACT GATCTCTACT ACCACAAGGA AAATAGTTTA GGAGAAACCA GCTTTTACTG 1920 

TTTTTGAAAA ATTACAGCTT CACCCTGTCA AGTTAACAAG GAATGCCTGT GCCAATAAAA 1980 

GGTTTCTCCA ACTTGA 1996 
(2) INFORMATION FOR SEQ ID NO : 7 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 294 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: CDNA 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: LIVER 

(B) CLONE: 87058 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 7 : 

CGGCACGAGC CAACTCCTGG AACACTGACT GGGGTGACAA TGGCTTCTTT AAAATACTCA 60 

GAGGACAGGT TCACTGTGGA ATCGAATCAG AAGTGGTGGC TGGAATTCCA CGCACCGTTC 120 

AGTACTGGGA AAAGTCTAAT CTGCCGTGGG CCTTCGTGCC AGTCCTGGGG GCGAGATGGG 180 

GGTAGT^TG CATTTTATTC TTTAAGTTCA CGTAAGATAC AAGTTTCAGA CAGGGGTCTA 240 

AGGCCTGGTT GCCAAAATCA GACCTGTTTT TCAAGGGGCC CAAGTCCTGG GTTC 294
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(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 552 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : Single 

(D) TOPOLOGY : linear 

(ii) MOLECULE TYPE: cDNA 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: Liver 

(B) CLONE: 87058.6 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

GTGAAGCTTG GAACTTCTGG ACAAGAAAAG GCCTGGTTTC TGGTGGCCTC TATGAATCCC 60 

ATGTAGGGTG CAGACCGTAC TCCATCCCTC CCTGTGAGCA CCACGTCAAC GGCTCCCGGC 120 

CCCCATGCAC GGGGGAGGGA GATACCCCCA AGTGTAGCAA GATCTGTGAG CCTGGCTACA 180

GCCCGACCTA CAAACAGGAC AAGCACTACG GATACAATTC CTACAGCGTC TCCAATAGCG 240 

AGAAGGACAT CATGGCCGAG ATCTACAAAA ACGGCCCCGT GGAGGGAGCT TTCTCTGTGT 300 

ATTCGGACTT CCTGCTCTAC AAGTCAGGAG TGTACCAACA CGTCACCGGA GAGATGATGG 360 

GTGGCCATGC CATCCGCATC CTGGGCTGGG GAGTGGAGAA TGGCACAACC TACTGGCTGG 420 

TTGGCAACTC CTGGAACACT GACTGGGGTG ACAATGGGTT CACTGTGGAA TCGAATCAGA 480 

MTGGTGGTG GAATTCCACG CACGATCAAG TGCTGGGAAA AGATCTTAAT CTGCCGGGGC 540 

TGTCGGCCAG TC 552 
(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 559 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: Liver 

(B) CLONE: 87058.8 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 9 : 
GAGGTACCTT CCTGGGTGGG CCCAAGCCAC CCCAGAGAGT TATGTTTACC GAGGACCTGA 60 
AGCTGCCTGC AAGCTTCGAT GCACGGGAAC AATGGCCACA GTGTCCCACC ATCAAAGAGA 120 
TCAGAGACCA GGGTCCTGTG GCTCCTGCTG GGCCTTCGGG GCTGTGGAAG CCATCTCTGA 180 
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CCGGATCTGA TCCACACCAA TGCGCACGTC AGCGTGGAGG TGTCGGCGGA GGACTGCTCA 240 

CATGCTGTGG CAGATGTGTG GGGACGGCTG TAATGGTGGC TATCCTGCTG AAGCTTGGAC 300

TTCTGGACAA GAAAAGGCCC TGGTTTCTGG TGGCCTCTAT GATCCCATGT AGGGTGTAGA 360 

CCGTACTCCA TCCCTCCCTG TGAAGCACCA CGTCAACGGT TCCCGGGCCC CATGCACGGG 420 

GAGGGAGATA CCCCCAAGTG TAACAAGATC TGTGAGCCTG GGTACAGTCC CGACCACAAA 480 

CAGGAAAAGC ACTACGGATA CAATTCCTCA GGTCTCCAAT AGTGAGAAGG GACATCATGC 540 

CGAGATCTAC AATAACGGC 559 
(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 622 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: Liver 

(B) CLONE: 87058.16 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

CGGTTGAGAT TCGGACAGTC CGAAAACGTC CGGCAAGTCA CCCGCTCCGC TGGCGCAGGC 60 

TGGGTGCAGG CTCTCGGTGC AGGCTGGGTG GATCTAGGAT CCGGCTTCCA ACATGTGGCA 120 

GTTCTGGGCC TCCCTCTGTG CCTGCTGGTG TTGGACAATG CCCGGAGGAG GCCTCTTTCC 180 

ATCCCCTGTC GGATGAGCTG GTCACTATGT CAACAAACGG AATACCACGT GGAGGCCGGG 240 

AACAACTTCT ACAACGTGGA CATGAGCTAC TTGAGAGGTA TGTGGTACCT TCCTGGGTGG 300 

GCCCAAGCCA CCCCAGAGAG TTTGTTTACC GAGGACCTGA GCTGCCTGCA AGCTTCGAAG 360 

GACGGGAACA ATGGCCACAG TGTCCCACCA TCAAAGAGAT CAGAGACAGG GCTCCTGTGG 420 

TCCTGCTGGG CCTCCGGGGC TGTGGAAGCA TCTCTGACCG GATCTGCATC CACACCAATG 480 

GCACGTCAGC GTGGTGGTGT CGGGGAGGAC CTGATCACCT TTGTGGTAGC ATGTGTGGGG 540 

GACGGCTGTA ATGGTGGTTA TCCTGTGAAG CTGGGCCTTC TAGAAAGAAA AGGCTGTTTT 600 

GGTGGCCTTA TGACTCCCAT GT 622 

(2) INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 984 base pairs 
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(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: Placenta 

(B) CLONE: 179696 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

ATGGAATGGG ACAATGGCAC AGACCAGGCT CTGGGCTTGC CACCCACCAC CTGTGTCTAC 60 

CGCGAGAACT TCAAGCAACT GCTGCTCCCA CCTGTGTATT CGGCGGTGCT GGCGCCTGCC 120 

CTCCCGCTGA ACATCTGTGT CATTACCCAG ATCTGCACGT CCCGCCGGGC CCTGACCCGC 180 

ACGGCCGTGT ACACCCTAAA CCTTGCTCTG CCTGACCTGC TATATGCCTG CTCCCTGCCC 240 

CTGCTCATCT ACAACTATGC CCAAGGTGAT CACTGGCCCT TTGGCGACTT CGCCTGCCGC 300 

CTGGTCCGCT TCCTCTTCTA TGCCAACCTG CACGGGAGGA TCCTCTTCCT CACCTGCATC 360 

AGCTTCCAGC GCTACCTGGG CATCTGCCAC CCGCTGGCCC CCTGGCACAA ACGTGGGGGC 420 

CGCCGGGCTG CCTGGCTAGT GTGTGTAGCC GTGTGGCTGG CCGTGACAAC CCAGTGCCTG 480 

CCCACAGCCA TCTTCGCTGC CACAGGCATC CAGCGTAACC GCACTGTCTG TTATGACCTC 540 

AGCCCGCCTG CCCTGGCCAC CCACTATATG CCCTATGGGA TGGCTCTCAC TGTCATCGGC 600 

TTCCTGCTGC CCTTTGCTGC CCTGCTGGCC TGCTACTGTC TCCTGGCCTG CCGCCTGTGC 660 

CGCCAGGATG GCCCGGCAGA GCCTGTGGCC CAGGAGCGGC GTGGCAAGGC GGCCCGCATG 720 

GCCGTGGTGG TGGCTGCTGT CTTTGGCATC AGCTTCCTGC CTTTTCACAT CACCAAGACA 780 

GCCTACCTGG CAGTGCGCTC GACGCCGGGC GTCCCCTGCA CTGTATTGGA GGCCTTTGCA 840 

GCGGCCTACA AAGGCACGCG GCCGTTTGCC AGTGCCAACA GCGTGCTGGA CCCCATCCTC 900 

TTCTACTTCA CCCAGAAGAA GTTCCGCCGG CGACCACATG AGCTCCTACA GAAACTCACA 960 

GACAAATGGC AGAGGCAGGG TCGC 984 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1446 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: Mast Cell 

(B) CLONE: 8118 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

ATGGCGTCTT TCTCTGCTGA GACCAATTCA ACTGACCTAC TCTCACAGCC ATGGAATGAG 60 

CCCCCAGTAA TTCTCTCCAT GGTCATTCTC AGCCTTACTT TTTTACTGGG ATTGCCAGGC 120 

AATGGGCTGG TGCTGTGGGT GGCTGGCCTG AAGATGCAGC GGACAGTGAA CACAATTTGG 180 

TTCCTCCACC TCACCTTGGC GGACCTCCTC TGCTGCCTCT CCTTGGCCTT CTCGCTGGCT 240 

CACTTGGCTC TCCAGGGACA GTGGCCCTAC GGCAGGTTCC TATGCAAGCT CATCCCCTCC 300 

ATCATTGTCC TCAACATGTT TGGCAGTGTC TTCCTGCTTA CTGCCATTAG CCTGGATCGC 360 

TGTCTTGTGG TATTCAAGCC AATCTGGTGT CAGAATCATC GCAATGTAGG GATGGCCTGC 420 

TCTATCTGTG GATGTATCTG GGTGGTGGCT TTTGTGTTGT GCATTCCTGT GTTCGTGTAC 480 

CGGGAAATCT TCACTACAGA CAACCATAAT AGATGTGGCT ACAAATTTGG TCTCTCCAGC 540 

TCATTAGATT ATCCAGACTT TTATGGGGAT CCACTAGAAA ACAGGTCTCT TGAAAACATT 600 

GTTCAGCCGC CTGGAGAAAT GAATGATAGG TTAGATCCTT CCTCTTTCCA AACAAATGAT 660 

CATCCTTGGA CAGTCCCCAC TGTCTTCCAA CCTCAAACAT TTCAAAGACC TTCTGCAGAT 720 

TCACTCCCTA GGGGTTCTGC TAGGTTAACA AGTCAAAATC TGTATTCTAA TGTATTTAAA 780 

CCTGCTGATG TGGTCTCACC TAAAATCCCC AGTGGGTTTC CTATTGAAGA TCACGAAACC 840 

AGCCCACTGG ATAACTCTGA TGCTTTTCTC TCTACTCATT TAAAGCTGTT CCCTAGCGCT 900 

TCTAGCAATT CCTTCTACGA GTCTGAGCTA CCACAAGGTT TCCAGGATTA TTACAATTTA 960 

GGCCAATTCA CAGATGACGA TCAAGTGCCA ACACCCCTCG TGGCAATAAC GATCACTAGG 1020 

CTAGTGGTGG GTTTCCTGCT GCCCTCTGTT ATCATGATAG CCTGTTACAG CTTCATTGTC 1080 

TTCCGAATGC AAAGGGGCCG CTTCGCCAAG TCTCAGAGCA AAACCTTTCG AGTGGCCGTG 1140 

GTGGTGGTGG CTGTCTTTCT TGTCTGCTGG ACTCCATACC ACATTTGGGG AGTCCTGTCA 1200 

TTGCTTACTG ACCCAGAAAC TCCCTTGGGG AAAACTCTGA TGTCCTGGGA TCATGTATGC 1260 

ATTGCTCTAG CATCTGCCAA TAGTTGCTTT AATCCCTTCC TTTATGCCCT CTTGGGGAAA 1320 

GATTTTAGGA AGAAAGCAAG GCAGTCCATT CAGGGAATTC TGGAGGCAGC CTTCAGTGAG 1380 

GAGCTCACAC GTTCCACCCA CTGTCCCTCA AACAATGTCA TTTCAGAAAG AAATAGTACA 1440 

ACTGTG 1446 
CLAIMS 

1. A method of extending the sequence of a partial complementary 
DNA (cDNA) using polymerase chain reaction (PCR), comprising the 
steps of: 

a) combining a first and second PCR primer with nucleic acid 
from a cDNA library expected to contain said partial cDNA, or a 
genomic library, under conditions suitable for synthesis of 
nucleic acid PCR products from the first and second primers, 
wherein said first and second primers are capable of annealing to 
opposite strands of the partial cDNA or genomic DNA and initiating 
nucleic acid synthesis in an outward manner and wherein the first 
primer is capable of being extended by DNA polymerase in an 
antisense direction and the second primer is capable of being 
extended in a sense direction, 

b) purifying the PCR products, and 

c) identifying extended nucleotide sequences derived from 
said partial cDNA or said genomic DNA. 

2. The method of Claim 1 wherein identifying extended sequences 
comprises nucleic acid sequencing. 

3. The method of Claim 2 further comprising extending the 
nucleotide sequences of step 6c by repeating steps 6a through 
6c on the nucleotide sequences identified in step 6c. 

4. A method of extending the nucleotide sequence of a partial 
complementary DNA (cDNA) using polymerase chain reaction 
(PCR), comprising the steps of: 

a) combining a first and second PCR primer with nucleic acid 
from a cDNA library expected to contain said partial cDNA, or a 
genomic library, under conditions suitable for synthesis of 
nucleic acid PCR products from the first and second primers, 
wherein said first and second primers are capable of annealing to 
opposite strands of the partial cDNA or genomic DNA and initiating 
nucleic acid synthesis in an outward manner and wherein the first 
primer is capable of being extended by DNA polymerase in an 
antisense direction and the second primer is capable of being 
extended in a sense direction, 

b) purifying the PCR products, 

c) ligating the purified PCR products under conditions 
suitable for the formation of circular closed nucleic acid, 

d) transforming a host cell with the circular closed nucleic 
acid and culturing the transformed host cell under conditions 
suitable for growth, 

e) recovering said circular closed nucleic acid from the 
cultured, transformed host cell, and 

f) identifying extended nucleotide sequences derived from 
said partial cDNA or said genomic DNA. 

5. The method of Claim 4 wherein identifying extended sequences 
comprises nucleic acid sequencing. 

6. The method of Claim 4 wherein culturing the transformed host 
cell under conditions suitable for growth comprises culturing 
in the presence of selective antibiotic conditions. 

7. The method of Claim 4 wherein said host cell is E. coli. 

8. The method of Claim 4 wherein after step 4b and prior to step 
4c, the purified PCR products are treated under conditions 
suitable for converting nucleic acid overhangs to blunt ends. 
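
The identifying step in the claims (finding where a newly sequenced product overlaps the known partial cDNA and reading the bases beyond that overlap) can be illustrated with a short sketch. This is only an illustration under stated assumptions: both sequences are on the same strand, the overlap is an exact match, and the function name `extend_3prime` and the `min_overlap` default are hypothetical, not taken from the specification. Real reads contain ambiguous bases such as N, which this sketch does not tolerate.

```python
def extend_3prime(known: str, new_read: str, min_overlap: int = 15) -> str:
    """Extend `known` at its 3' end using `new_read`.

    Looks for the longest suffix of `known` that exactly equals a prefix
    of `new_read` (at least `min_overlap` bases) and appends the bases of
    `new_read` that lie beyond the overlap. Returns `known` unchanged if
    no such overlap exists. Exact matching is an illustrative assumption.
    """
    longest_possible = min(len(known), len(new_read))
    for k in range(longest_possible, min_overlap - 1, -1):
        if known[-k:] == new_read[:k]:
            return known + new_read[k:]
    return known


if __name__ == "__main__":
    # Toy sequences, not taken from the sequence listing.
    print(extend_3prime("ACGTACGTAA", "CGTAAGGCC", min_overlap=4))
```

Repeating the call on the extended result with a further read mirrors the repetition described in Claim 3.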
Step 1 Partial cDNA sequence from public database or a researcher's 
earlier efforts 

Step 2 Two primers (XLR/XLS) designed based on partial sequence 

Step 3 Amplification of plasmids containing the gene of interest 

Step 4 Purification of the amplified DNA fragments 

Step 5 Religation of the amplified DNA fragments to circular closed DNA 

Step 6 Transformation of the circular closed DNA into E. coli cells 

Step 7 Growth of individual clones in liquid media under appropriate 
selection (e.g. Carb) 

Step 8 PCR-screening of the individual clones for different insert sizes 
upstream of the XLR-priming site. 

Step 9 Selection of clones for sequence analysis 

Step 10 Sequencing of clones of interest 



FIGURE 1 
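
Step 2 of the workflow, designing two primers that point away from each other at the ends of the known partial sequence, can be sketched as follows. This is a minimal illustration, not the authors' primer-design procedure: the function names are hypothetical, the 20-base primer length is an arbitrary assumption, and the mapping of XLR to the upstream-extending primer and XLS to the downstream-extending primer is inferred from Step 8 rather than stated outright. No melting-temperature or specificity checks are attempted.

```python
# Complement table for unambiguous bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_outward_primers(partial: str, length: int = 20) -> tuple[str, str]:
    """Return (xlr, xls) for a partial sequence written 5'->3' on the sense strand.

    xlr: reverse complement of the first `length` bases; it anneals to the
         sense strand and is extended in the antisense direction, away from
         the known sequence toward upstream DNA.
    xls: the last `length` bases; it is extended in the sense direction,
         away from the known sequence toward downstream DNA.
    """
    if len(partial) < 2 * length:
        raise ValueError("partial sequence too short for two non-overlapping primers")
    xlr = reverse_complement(partial[:length])
    xls = partial[-length:]
    return xlr, xls


if __name__ == "__main__":
    # Toy sequence, not taken from the sequence listing.
    print(design_outward_primers("AAAACCCCGGGGTTTTACGT", length=4))
```

Because both primers extend outward, amplification on a circular plasmid template proceeds around the vector, which is what allows the religation in Step 5.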
[Figure labels: cDNA insert; plasmid vector; primers. Products of XL-PCR reaction: see Figure 4.] 
[Figure labels: cDNA insert; plasmid vector; primers.] 



FIGURE 4 






WO 96/38591 PCT/US96/08501 





The purified DNA segments 
are religated and form a 
circular plasmid 




FIGURE 5 
[Figure labels: cDNA insert; plasmid vector; primers; T3, T7. Step 1 purification; Step 2 religation; Step 3 transformation and growth; Step 4 PCR-screening.] 



FIGURE 6 
10 20 30 40 50 

Hsp 90 1 CTCCGGCGCA GTGTTGGGAC TGTCTGGGTA TCGGAAAGCA AGCCTACGTT 50 

14201 1 50 

14201.3 1 gCTGGGTA TCGGAAAGCA AGCCTACGTT 50 

14201.5 1 GTTGGGAC TGTCTGGGTA TCGGAAAGCA AGCCTACGTT 50 

14201.13 1 

60 70 80 90 100 

Hsp 90 51 GCTCACTATT ACGTATAATC CTTTTCTTTT CAAGATGCCT GAGGAAGTGC 100 

14201 51 

14201.3 51 GCTCACTATT ACGTATAATC CTTTTCTNTN CAAGATGCCT GAGGAAGTGC 100 
14201.5 51 GCTCACTATT ACGTATAATC CTTTTCTTTT CAAGATGCCT GAGGAAGTGC 100 
14201.13 51 

110 120 130 140 150 

Hsp 90 101 ACCATGGAGA GGAGGAGGTG GAGACTTTTG CCTTTCAGGC AGAAATTGCC 150 

14201 101 150 

14201.3 101 ACCATGGAGA GGAGGAGGTG GAGACTTTTG CCTTTCAGGC AGAAATTGCC 150 

14201.5 101 ACCATGGAGA GGAGGAGGTG GAGACTTTTG CCTTTCAGGC AGAAATTGCC 150 

14201.13 101 

160 170 180 190 200 

Hsp 90 151 CAACTCATGT CCCTCATCAT CAATACCTTC TATTCCAACA AGGAGATTTT 200 

14201 151 

14201.3 151 CAACTCATGT CCCTCATCAT CAATACCTCC TATTCCAACA AGGAGATTNT 200 
14201.5 151 CAACTCATGT CCCTCATCAT CAATACCTCC TATTCCAACA AGGAGATTTT 200 
14201.13 151 

210 220 230 240 250 

Hsp 90 201 CCTTCGGGAG TTGATCTCTA ATGCTTCTGA TGCCTTGGAC AAGATTCGCT 250 

14201 201 

14201.3 201 CCTNCGGGAG TTGATCTCTA ATGCTTCTGA TGCCTCGGAC AAGATTCGCT 250 
14201.5 201 CCTTCGGGAG TTGATCTCTA ATGCTTCTGA TGCCTTGGAC AAGATTCGCT 250 
14201.13 201 

260 270 280 290 300 

Hsp 90 251 ATGAGAGCCT GACAGACCCT TCGAAGTTGG ACAGTGGTAA AGAGCTGAAA 300 

14201 251 

14201.3 251 ATGANAGCCT GACAGACCCT TCGAAGTNGG TCAGCGGCAA NGAGCTGAAA 300 
14201.5 251 ATGAGAGCCT GACAGACCCT TCGAAGTTGG ACAGTGGTAA AGAGCTGAAA 300 
14201.13 251 
310 320 330 340 350 

Hsp 90 301 ATTGACATCA TCCCCAACCC TCAGGAACGT ACCCTGACTT TGGTAGACAC 350 

14201 301 350 

14201.3 301 ATTGACATCA TCCCCAACCC TCAGGAACGT NCCCTGACTT TGGTAGACAC 350 

14201.5 301 ATTGACATCA TCCCCAACCC TCAGGAACGT ACCCTGACTT TGGTAGACAC 350 

14201.13 301 350 

360 370 380 390 400 

Hsp 90 351 AGGCATTGGC ATGACCAAAG CTGATCTCAT AAaTAATTtG GGAACCATTG 400 

14201 351 400 

14201.3 351 AGGCATTGGC ATGAaacAAG CTGAcCTCAT NAnTTATTcG GGgAaCcaTt 400 

14201.5 351 AGGCATcGGC ATGACCAAAG CTGATCTCAT AAnTAATTnG GGAACCATTG 400 

14201.13 351 400 

410 420 430 440 450 

Hsp 90 401 CCAAGTCTGG TACTAAAGCA TTCATGGAGG CTCTTCAGGC TGGTGCAGAC 450 

14201 401 450 

14201.3 401 CCAAGTCTTG TNCTAAAGCA TTCATGGAGG CTCTNCAGGN TGGcGCAGAC 450 

14201.5 401 NCAAGTCTGG TACTAAAGCA TTCATGGAGG CTCTTCAGGC TGGTGCAGAC 450 

14201.13 401 450 

460 470 480 490 500 

Hsp 90 451 ATCTCCATGA TTGGGCAGTT tGGTGTTGGC TttTATTCTG CCTACTTGGT 500 

14201 451 500 

14201.3 451 ATCTCCANGA TTNGGCAGNT GGGTGTTGGC TTnTATTCTG CCcACTTGGT 500 

14201.5 451 ATCTCCATGA TTGGGCAGTT GGGTGTTGNC TTnTATTCTG CCTcCTTGGT 500 

14201.13 451 500 

510 520 530 540 550 

Hsp 90 501 GGCAGAGAAA GTGGTTGTGA TCAGAAAGCA CAACGATGAT GAacAGTATG 550 

14201 501 550 

14201.3 501 GGCAGAGAAA NNT 550 

14201.5 501 GGCAGAGAAA GTNGTTGTGA TCA 550 

14201.13 501 TT GAgnAGTATG 550 

560 570 580 590 600 

Hsp 90 551 CTtgGgAGTc TtCTGcTGGA GGTTCCTTCA CTgtGCGTGC TGACcATGGT 600 

14201 551 600 

14201.3 551 600 

14201.5 551 600 

14201.13 551 -TcnGnAGT- TaCTGnTGGA GGTTCCTTCA CTnnGCGTGC TGAC-ATGGT 600 

610 620 630 640 650 

Hsp 90 601 GAGCCCATtG GcAtgGGTAC CAaAGTGATC CTCCATCTtA AAGAAGATCA 650 

14201 601 650 

14201.3 601 650 

14201.5 601 650 

14201.13 601 GAGCCCATnG GgAggGGTAC CAnAGTGATC CTCCATCTcA AAGAAGATCA 650 
660 670 680 690 700 

Hsp 90 651 GACAGAGTAC CTAGAaGAGA GGCGGgTCAA AGaAGTAGTG AaGAaGCATT 700 

14201 651 700 

14201.3 651 700 

14201.5 651 700 

14201.13 651 GACAGAGTAC CTAGAnGAGA GGCGGaTCAA AGaAGTAGTG AtGAnGCATc 700 

710 720 730 740 750 

Hsp 90 701 CTCAGtTCAT AGGCTATCCC ATCACCCTTT aTTTGGAGAA GGaACGAGAG 750 

14201 701 750 

14201.3 701 750 

14201.5 701 750 

14201.13 701 CTCAGaTCAT AGGCTATCCC ATCACCCTTT nTTTGGAGAA GGnACGAGAG 750 

760 770 780 790 800 

Hsp 90 751 AAGGAaATTA GtGATGATGA GGCAGAGGAA GAGAAaGGTG AGAAaGAAGA 800 

14201 751 800 

14201.3 751 800 

14201.5 751 800 

14201.13 751 AAGGAnATTA GnGATGATGA GGCAGAGGAA GAGAAtGGTG AGAAtGAAGA 800 

810 820 830 840 850 

Hsp 90 801 GGAaGaTAAa GATGATGAAG AAAagCCCAA GATCGAaGAT GTGGgTTCAG 850 

14201 801 850 

14201.3 801 850 

14201.5 801 850 

14201.13 801 GGAnGnTAAc GATGATGAAG AAAncCCCAA GATCGAtGAT GTGGnTTCAG 850 

860 870 880 890 900 

Hsp 90 851 ATGAGGaGGA TGACAGCGGT aAgGATAAGA AGAAGAAaAC TAaGAagATC 900 

14201 851 900 

14201.3 851 900 

14201.5 851 900 

14201.13 851 ATGAGGnGGA TGACAGCGGT nAnGATAAGA AGAAGAAnAC TAnGAnnATC 900 

910 920 930 940 950 

Hsp 90 901 AAAGAGAAAT ACATTGATCA GGAAGAACTA AACAAGACCA AGCCTATTTG 950 

14201 901 950 

14201.3 901 950 

14201.5 901 950 

14201.13 901 950 

960 970 980 990 1000 

Hsp 90 951 GACCAGAAAC CCTGATGACA TCACCCAAGA GGAGTATGGA GAATTCTACA 1000 

14201 951 1000 

14201.3 951 1000 

14201.5 951 1000 

14201.13 951 1000 
1010 1020 1030 1040 1050 

Hsp 90 1001 AGAGCCTCAC TAATGACTGG GAAGACCACT TGGCAGTCAA GCACTTTTCT 1050 

14201 1001 1050 

14201.3 1001 1050 

14201.5 1001 1050 

14201.13 1001 1050 

1060 1070 1080 1090 1100 

Hsp 90 1051 GTAGAAGGTC AGTTGGAATT CAGGGCATTG CTATTTATTC CTCGTCGGGC 1100 

14201 1051 1100 

14201.3 1051 1100 

14201.5 1051 1100 

14201.13 1051 1100 

1110 1120 1130 1140 1150 

Hsp 90 1101 TCCCTTTGAC CTTTTTGAGA ACAAGAAGAA AAAGAACAAC ATCAAACTCT 1150 

14201 1101 AAGAA AAAGAACAAC ATCAAACTCT 1150 

14201.3 1101 1150 

14201.5 1101 1150 

14201.13 1101 1150 



1160 1170 1180 1190 1200 

Hsp 90 1151 ATGTCCGCCG TGTGTTCATC ATGGaCAGCT GTGATGAGTT GATACCAGAG 1200 

14201 1151 ATGTCCGCCG TGTGTTCATC ATGGnCAGCT GTGATGAGTT GATACCAGAG 1200 

14201.3 1151 1200 

14201.5 1151 1200 

14201.13 1151 1200 



1210 1220 1230 1240 1250 

Hsp 90 1201 TATCTCAATT TTATCCGTGG TGTGGTTGAC TcTGAGGaTC TGCCCCTGAA 1250 

14201 1201 TATCTCAATT TTATCCGTGG TGTGGTTGAC TnTGAGGnTC TGCCCCTGAA 1250 

14201.3 1201 1250 

14201.5 1201 1250 

14201.13 1201 1250 

1260 1270 1280 1290 1300 

Hsp 90 1251 CATCTCCCGa GAAATGCTCC AGCAGAGCAA AATCTTGAAA GtCATTCGCA 1300 

14201 1251 CATCTCCCGn GAAATGCTCC AGCAGAGCAA AATCTTGAAA GgCATTCGCA 1300 

14201.3 1251 1300 

14201.5 1251 1300 

14201.13 1251 1300 

1310 1320 1330 1340 1350 

Hsp 90 1301 AAAACATTGT TAAGaAGTGC CTTgAGCTCT TCTCTgAGCT GGCAGAAGaC 1350 

14201 1301 AAAACATTGT TAAGnAGTGC CTTnAGCTCT TCTCTnAGCT GGCAGAAGnC 1350 

14201.3 1301 1350 

14201.5 1301 1350 

14201.13 1301 1350 
1360 1370 1380 1390 1400 

Hsp 90 1351 AAGGAGAATT ACAAGAAATT CTATGAGGCA TTCTCTAAAA ATCTCAAGCT 1400 

14201 1351 AAGG-GGATT TCAAGAAATT CTTTGGGG-- 1400 

14201.3 1351 1400 

14201.5 1351 1400 

14201.13 1351 1400 

1410 1420 1430 1440 1450 

Hsp 90 1401 TGGAATCCAC GAAGACTCCA CTAACCGCCG CCGCCTGTCT GAGCTGCTGC 1450 

14201 1401 1450 

14201.3 1401 1450 

14201.5 1401 1450 

14201.13 1401 1450 

1460 1470 1480 1490 1500 

Hsp 90 1451 GCTATCATAC CTCCCAGTCT GGAGATGAGA TGACATCTCT GTCAGAGTAT 1500 

14201 1451 1500 

14201.3 1451 1500 

14201.5 1451 1500 

14201.13 1451 1500 

1510 1520 1530 1540 1550 

Hsp 90 1501 GTTTCTCGCA TGAAGGAGAC ACAGAAGTCC ATCTATTACA TCACTGGTGA 1550 

14201 1501 1550 

14201.3 1501 1550 

14201.5 1501 1550 

14201.13 1501 1550 

1560 1570 1580 1590 1600 

Hsp 90 1551 GAGCAAAGAG CAGGTGGCCA ACTCAGCTTT TGTGGAGCGA GTGCGGAAAC 1600 

14201 1551 1600 

14201.3 1551 1600 

14201.5 1551 1600 

14201.13 1551 1600 

1610 1620 1630 1640 1650 

Hsp 90 1601 GGGGCTTCGA GGTGGTATAT ATGACCGAGC CCATTGACGA GTACTGTGTG 1650 

14201 1601 1650 

14201.3 1601 1650 

14201.5 1601 1650 

14201.13 1601 1650 
1660 1670 1680 1690 1700 

Hsp 90 1651 CAGCAGCTCA AGGAATTTGA TGGGAAGAGC CTGGTCTCAG TTACCAAGGA 1700 

14201 1651 1700 

14201.3 1651 1700 

14201.5 1651 1700 

14201.13 1651 1700 

1710 1720 1730 1740 1750 

Hsp 90 1701 GGGTCTGGAG CTGCCTGAGG ATGAGGAGGA GAAGAAGAAG ATGGAAGAGA 1750 

14201 1701 1750 

14201.3 1701 1750 

14201.5 1701 1750 

14201.13 1701 1750 

1760 1770 1780 1790 1800 

Hsp 90 1751 GCAAGGCAAA GTTTGAGAAC CTCTGCAAGC TCATGAAAGA AATCTTAGAT 1800 

14201 1751 1800 

14201.3 1751 1800 

14201.5 1751 1800 

14201.13 1751 1800 

1810 1820 1830 1840 1850 

Hsp 90 1801 AAGAAGGTTG AGAAGGTGAC AATCTCCAAT AGACTTGTGT CTTCACCTTG 1850 

14201 1801 1850 

14201.3 1801 1850 

14201.5 1801 1850 

14201.13 1801 1850 

1860 1870 1880 1890 1900 

Hsp 90 1851 CTGCATTGTG ACCAGCACCT ACGGCTGGAC AGCCAATATG GAGCGGATCA 1900 

14201 1851 1900 

14201.3 1851 1900 

14201.5 1851 1900 

14201.13 1851 1900 

1910 1920 1930 1940 1950 

Hsp 90 1901 TGAAAGCCCA GGCACTTCGG GACAACTCCA CCATGGGCTA TATGATGGCC 1950 

14201 1901 1950 

14201.3 1901 1950 

14201.5 1901 1950 

14201.13 1901 1950 

1960 1970 1980 1990 2000 

Hsp 90 1951 AAAAAGCACC TGGAGATCAA CCCTGACCAC CCCATTGTGG AGACGCTGCG 2000 

14201 1951 2000 

14201.3 1951 2000 

14201.5 1951 2000 

14201.13 1951 2000 
2010 2020 2030 2040 2050 

Hsp 90 2001 GCAGAAGGCT GAGGCCGACA AGAATGATAA GGCAGTTAAG GACCTGGTGG 2050 

14201 2001 2050 

14201.3 2001 2050 

14201.5 2001 2050 

14201.13 2001 2050 

2060 2070 2080 2090 2100 

Hsp 90 2051 TGCTGCTGTT TGAAACCGCC CTGCTATCTT CTGGCTTTTC CCTTGAGGAT 2100 

14201 2051 2100 

14201.3 2051 2100 

14201.5 2051 2100 

14201.13 2051 2100 

2110 2120 2130 2140 2150 

Hsp 90 2101 CCCCAGACCC ACTCCAACCG CATCTATCGC ATGATCA^GC TAGGTCTAGG 2150 

14201 2101 2150 

14201.3 2101 2150 

14201.5 2101 2150 

14201.13 2101 2150 

2160 2170 2180 2190 2200 

Hsp 90 2151 TATTGATGAA GATGAAGTGG CAGCAGAGGA ACCCAATGCT GCAGTTCCTG 2200 

14201 2151 2200 

14201.3 2151 2200 

14201.5 2151 2200 

14201.13 2151 2200 

2210 2220 2230 2240 2250 

Hsp 90 2201 ATGAGATCCC CCCTCTCGAG GGCGATGAGG ATGCGTCTCG CATGGAAa^A 2250 

14201 2201 2250 

14201.3 2201 2250 

14201.5 2201 2250 

14201.13 2201 2250 

2260 2270 2280 2290 2300 

Hsp 90 2251 GTCGATTAGG TTAGGAGTTC ATAGTTGGAA AAClgrGTGCC CTTGTATAGT 2300 

14201 2251 2300 

14201.3 2251 2300 

14201.5 2251 2300 

14201.13 2251 2300 

2310 2320 2330 2340 2350 

Hsp 90 2301 GTCCCCATGG GCTCCCACTG CAGCCTCGAG TGCCCCTGTC CCACCTGGCT 2350 

14201 2301 2350 

14201.3 2301 2350 

14201.5 2301 2350 

14201.13 2301 2350 
2360 2370 2380 2390 2400 

Hsp 90 2351 CCCCCTGCTG GTGTCTAGTG TTTTTTTCCC TCTCCTGTCC TTGTGTTGAA 2400 

14201 2351 2400 

14201.3 2351 2400 

14201.5 2351 2400 

14201.13 2351 2400 

2410 2420 2430 2440 2450 

Hsp 90 2401 GGCAGTAAAC TAAGGGTGTC AAGCCCCATT CCCTCTCTAC TCTTGACAGC 2450 

14201 2401 2450 

14201.3 2401 2450 

14201.5 2401 2450 

14201.13 2401 2450 

2460 2470 2480 2490 2500 

Hsp 90 2451 AGGATTGGAT GTTGTGTATT GTGGTTTATT TTATTTTCTT CATTTTGTTC 2500 

14201 2451 2500 

14201.3 2451 2500 

14201.5 2451 2500 

14201.13 2451 2500 

2510 2520 2530 2540 2550 

Hsp 90 2501 TGAAATTAAA GTATGCAAAA TAAAGAATAT GCCGTTTTTA TAG 2550 

14201 2501 2550 

14201.3 2501 2550 

14201.5 2501 2550 

14201.13 2501 2550 
10 20 30 40 50 

capthepsin 1 TCCGGCAACG CCAACCGCTC CGCTGCGCGC AGGCTGGGCT GCAGGCTCTC 50 

87058 1 50 

87058.6 1 50 

87058.8 1 50 

87058.16 1 50 

60 70 80 90 100 

capthepsin 51 GGCTGCAGCG CTGGGCTGGT GTGCAGTGGT GCGACCACGG CTCACGGCAG 100 

87058 51 100 

87058.6 51 100 

87058.8 51 100 

87058.16 51 NCN GGTTGAGNAT TCGGACNAGT CCaz^AAACGT CCGGCAAGTC 100 

110 120 130 140 150 

capthepsin 101 CCTCAGCCAC CCAGATGTAA GCGATCTGGT TCCCACCTCA GCCTCCCGAG 150 

87058 101 150 

87058.6 101 150 

87058.16 101 ACCCGCTCCG CTGNGCGCAG GCTGGGNTGC AGGCTCTCGG NTGCAGNGCT 150 

160 170 180 190 200 

capthepsin 151 TAGTGGATCT AGGATCCGGC TTCCAACATG TGGCAGcTCT GGGCCTCCCT 200 

87058 151 200 

87058.6 151 200 

87058.8 151 200 

87058.16 151 GGGTGGATCT AGGATCCGGC TTCCAACATG TGGCAGtTCT GGGCCTCCCT 200 

210 220 230 240 250 

capthepsin 201 CTGcTGCCTG CTGGTGTTGG cCAATGCCCG GAGcAGGcCC TCTTTCCATC 250 

87058 201 250 

87058.6 201 250 

87058.8 201 250 

87058.16 201 CTGnTGCCTG CTGGTGTTGG aCAATGCCCG GAGgAGGnCC TCTTTCCATC 250 

260 270 280 290 300 

capthepsin 251 CCCTGTCGGA TGAGCTGGTC AaCTATGTCA ACAAACGGAA TACCACGTGG 300 

87058 251 300 

87058.6 251 300 

87058.8 251 300 

87058.16 251 CCCTGTCGGA TGAGCTGGTC AnCTATGTCA ACAAACGGAA TACCACGTGG 300 
310 320 330 340 350 

capthepsin 301 cAGGCCGGaA ACAACTTCTA CAACGTGGAC ATGAGCTACT TGAaGAGGcT 350 

87058 301 350 

87058.6 301 350 

87058.8 301 350 

87058.16 301 nAGGCCGGgA ACAACTTCTA CAACGTGGAC ATaAGCTACT TGAnGAGGnT 350 

360 370 380 390 400 

capthepsin 351 ATGTGGTACC TTCCTGGGTG GGCCCAAGCC ACCCCAGAGA GTTATGTTTA 400 

87058 351 400 

87058.6 351 400 

87058.8 351 —GaGGTACC TTCCTGGGTG GGCCCAAGCC ACCCCAGAGA GTTATGTTTA 400 

87058.16 351 ATGTGGTACC pCCTGGGTG GGCCCAAGCC ACCCCAGAGA GTTNTGTTTA 400 

410 420 430 440 450 

capthepsin 401 CCGAGGACCT GAAGCTGCCT GCAAGCTTCG ATGCACGGGA ACAATGGCCA 450 

87058 401 450 

87058.6 401 450 

87058.8 401 CCGAGGACCT GAAGCTGCCT GCAAGCTTCG ATGCACGGGA ACAATGGCO^ 450 

87058.16 401 CCGAGGACCT GANGCTGCCT GCAAGCTTCG AaGgACGGGA ACAATGGCCA 450 

460 470 480 490 500 

capthepsin 451 CAGTGTCCCA CCATCAAAGA GATCAGAGAC CAGGGCTCCT GTGGCTCCTG 500 

87058 451 500 

87058.6 451 500 

87058.8 451 CAGTGTCCCA CCATCAAAGA GATCAGAGAC CAGGGNTCCT GTGGCTCCTG 500 

87058.16 451 CAGTGTCCCA CCATCAAAGA GATCAGAGAN CAGGGCTCCT GTGGNTCCTG 500 

510 520 530 540 550 

capthepsin 501 CTGGGCCTTC GGGGCTGTGG AAGCCATCTC TGACCGGATC TGCATCCACA 550 

87058 501 550 

87058.6 501 550 

87058.8 501 CTGGGCCTTC GGGGCTGTGG AAGCCATCTC TGACCGGATC TGNATCCACA 550 

87058.16 501 CTGGGCCTcC GGGGCTGTGG AAGNCATCTC TGACCGaATC TGCATCCACA 550 

560 570 580 590 600 

capthepsin 551 CCAATGCGCA CGTCAGCGTG GAGGTGTCGG CCGAGGACCT GCTCACATGC 600 

87058 551 600 

87058.6 551 600 

87058.8 551 CCAATGCGCA CGTCAGCGTG GAGGTGTCGG CGGAGGAC-T GCTCACATGC 600 

87058.16 551 CCAATGNGCA CGTCAGCGTG GtGGTGTCGG NGGAGGACCT GaTCACCTNt 600 

610 620 630 640 650 

capthepsin 601 TGTGGCAGCA TGTGTGGGGA CGGCTGTAAT GGTGGCTATC CTGCTGAAGC 650 

87058 601 650 

87058.6 601 gTGAAGC 650 

87058.8 601 TGTGGCAGNA TGTGTGGGGA CGGCTGTAAT GGTGGCTATC CTGCTGAAGC 650 

87058.16 601 TGTGGtAGCA TGTGTGGGGA CGGCTGTAAT GGTGGtTATC CTGNTGAAGC 650 



FIGURE 8B 






660 670 680 690 700 

capthepsin 651 TTGGAACTTC TGGACAAGAA AAGGCCTGGT TTCTGGTGGC CTCTATGAAT 700 

87058 651 700 

87058.6 651 TTGGAACTTC TGGACAAGAA AAGGCCTGGT TTCTGGTGGC CTCTATGAAT 700 

87058.8 651 TTGGNACTTC TGGACAAGAA AAGGCCTGGT TTCTGGTGGC CTCTATGANT 700 

87058.16 651 TNGGgNCTTC TNagaAAGAA AAGGCtNGtT TT-GGTGGC CT-TATGAcT 700 

710 720 730 740 750 

capthepsin 701 CCCATGTAGG GTGCAGACCG TACtCCATCC CTCCCTGTGA GCACCACGTC 750 

87058 701 750 

87058.6 701 CCCATGTAGG GTGCAGACCG TACTCCATCC CTCCCTGTGA GCACCACGTC 750 

87058.8 701 CCCATGTAGG GTGTAGACCG TACTCCATCC CTCCCTGTGA GCACCACGTC 750 

87058.16 701 CCCATGT 750 

760 770 780 790 800 

capthepsin 751 AACGGCTCCC GGCCCCCATG CACGGGGGAG GGAGATACCC CCAAGTGTAG 800 

87058 751 800 

87058.6 751 AACGGCTCCC GGCCCCCATG CACGGGGGAG GGAGATACCC CCAAGTGTAG 800 

87058.8 751 AACGGtTCCC GGgCCCCATG CACGGNGGAG GGAGATACCC CCAAGTGTAa 800 

87058.16 751 800 

810 820 830 840 850 

capthepsin 801 CAAGATCTGT GAGCCTGGCT ACAGCCCGAC CTACAAACAG GACAAGCACT 850 

87058 801 850 

87058.6 801 CAAGATCTGT GAGCCTGGCT ACAGCCCGAC CTACAAACAG GACAAGCACT 850 

87058.8 801 CAAGATCTGT GAGCCTGGgT ACAGtCCcga CcACAAACAG GAaAAGCACT 850 

87058.16 801 850 

860 870 880 890 900 

capthepsin 851 ACGGATACAA TTCCTACAGC GTCTCCAATA GCGAGAAGGA CATCATGGCC 900 

87058 851 900 

87058.6 851 ACGGATACAA TTCCTACAGC GTCTCCAATA GCGAGAAGGA CATCATGGCC 900 

87058.8 851 ACGGATACAA TTCCT-CAGN GTCTCCAATA GtGAGAAGGA CATCAT-GCC 900 

87058.16 851 900 
910 920 930 940 950 

capthepsin 901 GAGATCTACA AAAACGGCCC CGTGGAGGGA GCTTTCTCTG TGTATTCGGA 950 

87058 901 950 

87058.6 901 GAGATCTACA AAAACGGCCC CGTGGAGGGA GCTTTCTCTG TGTATTCGGA 950 

87058.8 901 GAGATCTACA AtAACGGC 950 

87058.16 901 950 

960 970 980 990 1000 

capthepsin 951 CTTCCTGCTC TACAAGTCAG GAGTGTACCA ACACGTCACC GGAGAGATGA 1000 

87058 951 1000 

87058.6 951 CTTCCTGCTC TACAAGTCAG GAGTGTACCA ACACGTCACC GGAGAGATGA 1000 

87058.8 951 1000 

87058.16 951 1000 
1010 1020 1030 1040 1050 

capthepsin 1001 TGGGTGGCCA TGCCATCCGC ATCCTGGGCT GGGGAGTGGA GAATGGCACA 1050 

87058 1001 1050 

87058.6 1001 TGGGTGGCCA TGCCATCCGC ATCCTGGGCT GGGGAGTGGA GAATGGCACA 1050 

87058.8 1001 1050 

87058.16 1001 1050 

1060 1070 1080 1090 1100 

capthepsin 1051 CCCTACTGGC TGGTTGCCAA CTCCTGGAAC ACTGACTGGG GTGACAATGG 1100 

87058 1051 cGg cagacGCCAA CTCCTGGAAC ACTGACTGGG GTGACAATGG 1100 

87058.6 1051 aCCTACTGGC TGGTTGgCAA CTCCTGGAAC ACTGACTGGG GTGACAATGG 1100 

87058.8 1051 1100 

87058.16 1051 1100 
1110 1120 1130 1140 1150 

capthepsin 1101 CTTCTTTAAA ATACTCAGAG GACAGGATCA CTGTGGAATC GAATCAGAAG 1150 

87058 1101 CTTCTTTAAA ATACTCAGAG GACAGGTTCA CTGTGGAATC GAATCAGAAG 1150 

87058.6 1101 gXTC 1150 

87058.8 1101 1150 

87058.16 1101 1150 

1160 1170 1180 1190 1200 

capthepsin 1151 TGGTGGCTGG AATTCCACGC ACCGATCAGT ACTGGGAAAA GATCTAATCT 1200 

87058 1151 TGGTGGCTGG AATTCCACGC ACCGTTCAGT ACTGGGAAAA GNTCTAATCT 1200 

87058.6 1151 1200 

87058.8 1151 1200 

87058.16 1151 1200 

1210 1220 1230 1240 1250 

capthepsin 1201 GCCGTGGGCC TGTCGTGCCA GTCCTGGGGG CGAGATCGGG GTAGAAATGC 1250 

87058 1201 GCCGTGGGCC TNTCGTGCCA GTCCTGGGGG CGAGATGGGG GTAGAAATGC 1250 

87058.6 1201 1250 

87058.8 1201 1250 

87058.16 1201 1250 

1260 1270 1280 1290 1300 

capthepsin 1251 ATTTTATTCT TTAAGTTCAC GTAAGATACA AGTTTCAGgC AGGGTCTgAA 1300 

87058 1251 ATTTTATTCT TTAAGTTCAC GTAAGATACA AGTTTCAGaC AGGGTCTnAA 1300 

87058.6 1251 1300 

87058.8 1251 1300 

87058.16 1251 1300 

1310 1320 1330 1340 1350 

capthepsin 1301 GGaCTGGaTT gGCCAAAcAT CAGACCTGTC TTCCAAGGAG ACCAAGTCCT 1350 

87058 1301 GGcCTGGnTT nGCCAAAnAT CAGACCTGT 1350 

87058.6 1301 1350 

87058.8 1301 1350 

87058.16 1301 1350 
1360 1370 1380 1390 1400 

capthepsin 1351 GGCTACATCC CAGCCTGTGG TTACAGTGCA GACAGGCCAT GTGAGCCACC 1400 

87058 1351 1400 

87058.6 1351 1400 

87058.8 1351 1400 

87058.16 1351 1400 

1410 1420 1430 1440 1450 

capthepsin 1401 GCTGCCAGCA CAGAGCGTCC TTCCCCCTGT AGACTAGTGC CGTGGGAGTA 1450 

87058 1401 1450 

87058.6 1401 1450 

87058.8 1401 1450 

87058.16 1401 1450 

1460 1470 1480 1490 1500 

capthepsin 1451 CCTGCTGCCC AGCTGCTGTG GCCCCCTCCG TGATCCATCC ATCTCCAGGG 1500 

87058 1451 1500 

87058.6 1451 1500 

87058.8 1451 1500 

87058.16 1451 1500 

1510 1520 1530 1540 1550 

capthepsin 1501 AGCAAGACAG AGACGCAGGA TGGAAAGCGG AGTTCCTAAC AGGATGAAAG 1550 

87058 1501 1550 

87058.6 1501 1550 

87058.8 1501 1550 

87058.16 1501 1550 

1560 1570 1580 1590 1600 

capthepsin 1551 TTCCCCCATC AGTTCCCCCA GTACCTCCAA GCAAGTAGCT TTCCACATTT 1600 

87058 1551 1600 

87058.6 1551 1600 

87058.8 1551 1600 

87058.16 1551 1600 

1610 1620 1630 1640 1650 

capthepsin 1601 GTCACAGAAA TCAGAGGAGA GATGGTGTTG GGAGCCCTTT GGAGAACGCC 1650 

87058 1601 1650 

87058.6 1601 1650 

87058.8 1601 1650 

87058.16 1601 1650 
1660 1670 1680 1690 1700 

capthepsin 1651 AGTCTCCAGG TCCCCCTGCA TCTATCGAGT TTGCAATGTC ACAACCTCTC 1700 

87058 1651 1700 

87058.6 1651 1700 

87058.8 1651 1700 

87058.16 1651 1700 

1710 1720 1730 1740 1750 

capthepsin 1701 TGATCTTGTG CTCAGCATGA TTCTTTAATA GAAGTTTTAT TTTTCGTXSCA 1750 

87058 1701 1750 

87058.6 1701 1750 

87058.8 1701 1750 

87058.16 1701 1750 

1760 1770 1780 1790 1800 

capthepsin 1751 CTCTGCTAAT CATGTGGGTG AGCCAGTGGA ACAGCGGGAG CCTGTGCTGG 1800 

87058 1751 1800 

87058.6 1751 1800 

87058.8 1751 1800 

87058.16 1751 1800 

1810 1820 1830 1840 1850 

capthepsin 1801 TTTGCAGATT GCCTCCTAAT GACGCGGCTC AAAAGGAAAC CAAGTGGTCA 1850 

87058 1801 1850 

87058.6 1801 1850 

87058.8 1801 1850 

87058.16 1801 1850 

1860 1870 1880 1890 1900 

capthepsin 1851 GGAGTTGTTT CTGACCCACT GATCTCTACT ACCACAAGGA AAATAGTTTA 1900 

87058 1851 1900 

87058.6 1851 1900 

87058.8 1851 1900 

87058.16 1851 1900 

1910 1920 1930 1940 1950 

capthepsin 1901 GGAGAAACCA GCTTTTACTG TTTTTGAAAA ATTACAGCTT CACCCTGTCA 1950 

87058 1901 1950 

87058.6 1901 1950 

87058.8 1901 1950 

87058.16 1901 1950 

1960 1970 1980 1990 2000 

capthepsin 1951 AGTTAACAAG GAATGCCTGT GCCAATAAAA GGTTTCTCCA ACTTGA 2000 

87058 1951 2000 

87058.6 1951 2000 

87058.8 1951 2000 

87058.16 1951 2000 
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